Configuration to modify preemptive scheduling behavior

Configuration parameters modify various aspects of preemptive scheduling behavior by:

  • Modifying the selection of the queue to preempt jobs from

  • Modifying the selection of the job to preempt

  • Modifying preemption of backfill and exclusive jobs

  • Modifying the way job slot limits are calculated

  • Modifying the number of jobs to preempt for a parallel job

  • Modifying the control action applied to preempted jobs

  • Controlling how many times a job can be preempted

Configuration to modify selection of queue to preempt


File

Parameter

Syntax and description

lsb.queues

PREEMPTION

PREEMPTION=PREEMPTIVE[low_queue+pref …]
  • Jobs in this queue can preempt running jobs from the specified queues, starting with jobs in the queue with the highest value set for preference

PREEMPTION=PREEMPTABLE[hi_queue …]
  • Jobs in this queue can be preempted by jobs from the specified queues

PRIORITY=integer

  • Sets the priority for this queue relative to all other queues

  • The higher the priority value, the more likely jobs from this queue are to preempt jobs from other queues, and the less likely they are to be preempted by jobs from other queues
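
As an illustration of the two parameters above, the following lsb.queues fragment is a minimal sketch, not a recommended setup. The queue names (high, low1, low2), the priority values, and the preference levels are hypothetical.

```text
# lsb.queues (sketch; queue names, priorities, and preference levels are hypothetical)
Begin Queue
QUEUE_NAME = high
PRIORITY   = 70
# Jobs in "high" can preempt running jobs in low1 and low2.
# low1 has the higher preference value, so its jobs are preempted first.
PREEMPTION = PREEMPTIVE[low1+2 low2+1]
End Queue

Begin Queue
QUEUE_NAME = low1
PRIORITY   = 20
End Queue

Begin Queue
QUEUE_NAME = low2
PRIORITY   = 20
End Queue
```

In this sketch the preemptive queue is given a higher PRIORITY value than the queues it preempts, so the two parameters work in the same direction.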


Configuration to modify selection of job to preempt


Files

Parameter

Syntax and description

lsb.params

lsb.applications

PREEMPT_FOR

PREEMPT_FOR=LEAST_RUN_TIME

  • Preempts the job that has been running for the shortest time

NO_PREEMPT_RUN_TIME

NO_PREEMPT_RUN_TIME=%
  • Prevents preemption of jobs that have been running for the specified number of minutes, or the specified percentage of the job duration, or longer

  • If NO_PREEMPT_RUN_TIME is specified as a percentage, the job cannot be preempted after it has run for that percentage of the job duration. For example, if the job run limit is 60 minutes and NO_PREEMPT_RUN_TIME=50%, the job cannot be preempted after it has run for 30 minutes or longer.

  • Specifying a percentage for NO_PREEMPT_RUN_TIME requires an estimated run time (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications) to be specified for the job.

NO_PREEMPT_FINISH_TIME

NO_PREEMPT_FINISH_TIME=%
  • Prevents preemption of jobs that will finish within the specified number of minutes, or the specified percentage of the job duration.

  • If NO_PREEMPT_FINISH_TIME is specified as a percentage, the job cannot be preempted once the time remaining before it finishes is within that percentage of the job duration. For example, if the job run limit is 60 minutes and NO_PREEMPT_FINISH_TIME=10%, the job cannot be preempted after it has run for 54 minutes or longer.

  • Specifying a percentage for NO_PREEMPT_FINISH_TIME requires an estimated run time (bsub -We or RUNTIME in lsb.applications) or a run limit (bsub -W, RUNLIMIT in lsb.queues, or RUNLIMIT in lsb.applications) to be specified for the job.

lsb.params

lsb.queues

lsb.applications

MAX_TOTAL_TIME_PREEMPT

MAX_TOTAL_TIME_PREEMPT=minutes

  • Prevents preemption of jobs whose accumulated preemption time has already reached the specified number of minutes or more.

  • The accumulated preemption time is reset in the following cases:

    • Job status becomes EXIT or DONE

    • Job is re-queued

    • Job is re-run

    • Job is migrated and restarted

  • MAX_TOTAL_TIME_PREEMPT does not affect preemption triggered by advance reservation or Platform License Scheduler.

  • Accumulated preemption time does not include preemption by advance reservation or Platform License Scheduler.

NO_PREEMPT_INTERVAL

NO_PREEMPT_INTERVAL=minutes
  • Prevents preemption of jobs until they have run uninterrupted for the specified number of minutes since the job was dispatched or last resumed.

  • NO_PREEMPT_INTERVAL does not affect preemption triggered by advance reservation or Platform License Scheduler.
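
The parameters in this table can be combined in one file. The following lsb.params fragment is a minimal sketch with hypothetical values, chosen only to match the worked examples above. It is not a tuning recommendation.

```text
# lsb.params (sketch; all values are hypothetical)
Begin Parameters
# Preempt the job that has been running for the shortest time
PREEMPT_FOR            = LEAST_RUN_TIME
# Do not preempt a job once it has run for half of its job duration
NO_PREEMPT_RUN_TIME    = 50%
# Do not preempt a job that will finish within 10% of its job duration
NO_PREEMPT_FINISH_TIME = 10%
# Do not preempt a job whose accumulated preemption time has reached 120 minutes
MAX_TOTAL_TIME_PREEMPT = 120
# Do not preempt a job until it has run uninterrupted for 5 minutes
NO_PREEMPT_INTERVAL    = 5
End Parameters
```

With a 60-minute run limit, the two percentage settings protect a job once it has run for 30 minutes (NO_PREEMPT_RUN_TIME) and once it has run for 54 minutes (NO_PREEMPT_FINISH_TIME). The percentage forms apply only to jobs that have a run time or run limit specified.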


Configuration to modify preemption of backfill and exclusive jobs


File

Parameter

Syntax and description

lsb.params

PREEMPT_JOBTYPE

PREEMPT_JOBTYPE=BACKFILL

  • Enables preemption of backfill jobs.

  • Requires the line PREEMPTION=PREEMPTABLE in the queue definition.

  • Only jobs from queues with a higher priority than queues that define resource or slot reservations can preempt jobs from backfill queues.

PREEMPT_JOBTYPE=EXCLUSIVE

  • Enables preemption of and preemption by exclusive jobs.

  • Requires the line PREEMPTION=PREEMPTABLE or PREEMPTION=PREEMPTIVE in the queue definition.

  • Requires the definition of LSB_DISABLE_LIMLOCK_EXCL in lsf.conf.

PREEMPT_JOBTYPE=EXCLUSIVE BACKFILL

  • Enables preemption of exclusive jobs, backfill jobs, or both.

lsf.conf

LSB_DISABLE_LIMLOCK_EXCL

LSB_DISABLE_LIMLOCK_EXCL=y
  • Enables preemption of exclusive jobs.

  • For a host running an exclusive job:
    • lsload displays the host status ok.

    • bhosts displays the host status closed.

    • Users can run tasks on the host using lsrun or lsgrun. To prevent users from running tasks during execution of an exclusive job, the parameter LSF_DISABLE_LSRUN=y must be defined in lsf.conf.

  • Changing this parameter requires a restart of all sbatchds in the cluster (badmin hrestart). Do not change this parameter while exclusive jobs are running.
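
The settings in this table span three files. The following fragments are a minimal sketch of how they might fit together. The queue name is hypothetical, and any other queue parameters a site needs are omitted.

```text
# lsb.params
Begin Parameters
# Allow preemption of both exclusive and backfill jobs
PREEMPT_JOBTYPE = EXCLUSIVE BACKFILL
End Parameters

# lsb.queues (sketch; queue name is hypothetical)
Begin Queue
QUEUE_NAME = low_excl
# Required so that jobs in this queue can be preempted
PREEMPTION = PREEMPTABLE
End Queue

# lsf.conf
# Required for preemption of exclusive jobs
LSB_DISABLE_LIMLOCK_EXCL=y
# Optional: keep users from running lsrun or lsgrun tasks on a host
# while an exclusive job is executing there
LSF_DISABLE_LSRUN=y
```

After changing LSB_DISABLE_LIMLOCK_EXCL, restart all sbatchds with badmin hrestart, and make the change only when no exclusive jobs are running.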


Configuration to modify how job slot usage is calculated


File

Parameter

Syntax and description

lsb.params

PREEMPT_FOR

PREEMPT_FOR=GROUP_JLP

  • Counts only running jobs when evaluating if a user group is approaching its per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=GROUP_MAX

  • Counts only running jobs when evaluating if a user group is approaching its total job slot limit (SLOTS, PER_USER=all, and HOSTS in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=HOST_JLU

  • Counts only running jobs when evaluating if a user or user group is approaching its per-host job slot limit (SLOTS, PER_USER=all, and HOSTS in the lsb.resources file), ignoring suspended jobs

PREEMPT_FOR=USER_JLP

  • Counts only running jobs when evaluating if a user is approaching their per-processor job slot limit (SLOTS_PER_PROCESSOR, USERS, and PER_HOST=all in the lsb.resources file), ignoring suspended jobs
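
The following lsb.params fragment is a minimal sketch of one slot-counting configuration. It assumes that PREEMPT_FOR accepts several space-separated keywords on one line, which the table above does not state. The particular combination shown is hypothetical.

```text
# lsb.params (sketch; the choice of keywords is hypothetical)
Begin Parameters
# Ignore suspended jobs when checking user-group total slot limits
# and per-host slot limits
PREEMPT_FOR = GROUP_MAX HOST_JLU
End Parameters
```

Because PREEMPT_FOR also carries keywords described in other tables (for example LEAST_RUN_TIME), a site that uses several of them would list them together on this one line under the same assumption.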


Configuration to modify preemption of parallel jobs


File

Parameter

Syntax and description

lsb.params

PREEMPT_FOR

PREEMPT_FOR=MINI_JOB

  • Optimizes preemption of parallel jobs by preempting only enough low-priority parallel jobs to start the high-priority parallel job

PREEMPT_FOR=OPTIMAL_MINI_JOB

  • Optimizes preemption of parallel jobs by choosing the low-priority parallel jobs to preempt so that the fewest possible jobs are suspended to allow the high-priority parallel job to start


Configuration to modify the control action applied to preempted jobs


File

Parameter

Syntax and description

lsb.queues

TERMINATE_WHEN

TERMINATE_WHEN=PREEMPT

  • Changes the default control action of SUSPEND to TERMINATE so that LSF terminates preempted jobs
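
The following lsb.queues fragment is a minimal sketch of a preemptable queue whose preempted jobs are terminated instead of suspended. The queue name and priority value are hypothetical.

```text
# lsb.queues (sketch; queue name and priority are hypothetical)
Begin Queue
QUEUE_NAME     = low
PRIORITY       = 20
PREEMPTION     = PREEMPTABLE
# Terminate, rather than suspend, jobs in this queue when they are preempted
TERMINATE_WHEN = PREEMPT
End Queue
```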


Configuration to control how many times a job can be preempted

By default, if preemption is enabled, there is no guarantee that a job will ever complete. A lower-priority job could be preempted again and again, and ultimately be killed because it reaches its run limit.

The limit on the number of times a job can be preempted can be configured cluster-wide (lsb.params), at the queue level (lsb.queues), and at the application level (lsb.applications). MAX_JOB_PREEMPT in lsb.applications overrides the setting in lsb.queues, and the setting in lsb.queues overrides the setting in lsb.params.


Files

Parameter

Syntax and description

lsb.params

lsb.queues

lsb.applications

MAX_JOB_PREEMPT

MAX_JOB_PREEMPT=integer
  • Specifies the maximum number of times a job can be preempted.

  • Specify a value within the following range:

    0 < MAX_JOB_PREEMPT < INFINIT_INT

    INFINIT_INT is defined in lsf.h

  • By default, the number of preemption times is unlimited.


When MAX_JOB_PREEMPT is set and a job is preempted by a higher-priority job, the job's preemption count is set to 1. When the preemption count exceeds MAX_JOB_PREEMPT, the job runs to completion and cannot be preempted again.

The job preemption count is recovered when LSF is restarted or reconfigured.
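
The override order for MAX_JOB_PREEMPT can be sketched with one setting at each level. The queue name, application profile name, and limit values below are hypothetical.

```text
# lsb.params: cluster-wide setting
Begin Parameters
MAX_JOB_PREEMPT = 5
End Parameters

# lsb.queues: overrides lsb.params for jobs in this queue
Begin Queue
QUEUE_NAME      = low
MAX_JOB_PREEMPT = 3
End Queue

# lsb.applications: overrides lsb.queues for jobs that use this profile
Begin Application
NAME            = batch_app
MAX_JOB_PREEMPT = 2
End Application
```

Under these settings, a job submitted to queue low with application profile batch_app would use a limit of 2. A job in queue low without that profile would use 3, and a job in a queue with no setting of its own would use the cluster-wide value of 5.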