Platform Computing Corp.

Pre-Execution and Post-Execution Commands

Jobs can be submitted with optional pre- and post-execution commands. A pre- or post-execution command is an arbitrary command to run before the job starts or after the job finishes. Pre- and post-execution commands are executed in a separate environment from the job.

About Pre-Execution and Post-Execution Commands

Each batch job can be submitted with optional pre- and post-execution commands. Pre- and post-execution commands can be any executable command lines to be run before a job is started or after a job finishes.

Some batch jobs require resources that LSF does not directly support. Appropriate pre- and/or post-execution commands can be used to handle such situations, for example checking that a tape drive is ready or that a software license is available before the job starts.

By default, the pre- and post-execution commands are run under the same user ID, environment, and home and working directories as the batch job. If the command is not in your normal execution path, the full path name of the command must be specified.

For parallel jobs, the command is run on the first selected host.

The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.

Pre-execution commands

Pre-execution commands support job starting decisions which cannot be configured directly in LSF. LSF supports job-level, queue-level, and application-level (lsb.applications) pre-execution.

The pre-execution command returns information to LSF using its exit status. When a pre-execution command is specified, the job is held in the queue until the specified pre-execution command returns exit status zero (0).

If the pre-execution command exits with non-zero status, the batch job is not dispatched. The job goes back to the PEND state, and LSF tries to dispatch another job to that host. While the job is pending, other jobs can proceed ahead of the waiting job. The next time LSF tries to dispatch jobs, this process is repeated.

If the pre-execution command exits with a value of 99, the job does not go back to the PEND state; instead, it exits. This gives you the flexibility to abort the job if the pre-execution command fails.
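The exit-status convention described above can be sketched as a small shell function. This is a minimal sketch, not an LSF-supplied script: the function name preexec_status and the choice of testing a scratch directory are assumptions made for illustration. Only the meaning of exit status 0, non-zero, and 99 comes from the text.

```shell
#!/bin/sh
# Hypothetical pre-execution check; the function name and the directory test
# are illustrative assumptions, not part of LSF.
# Returns 0  -> LSF starts the job
#         99 -> the job exits instead of going back to PEND
#         1  -> (any other non-zero) the job goes back to PEND and is retried
preexec_status() {
    dir=$1
    if [ -z "$dir" ]; then
        return 99          # no directory given: retrying cannot help, abort
    fi
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        return 0           # directory is ready
    fi
    return 1               # not ready yet: let LSF try again later
}

# In a real pre-execution script the last line would be something like:
#   preexec_status "$1"; exit $?
```

Reserving 99 for conditions that cannot improve on a retry keeps a misconfigured job from cycling through PEND indefinitely.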

LSF assumes that the pre-execution command runs without side effects. For example, if the pre-execution command reserves a software license or other resource, you must not reserve the same resource more than once for the same batch job.
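Because a pre-execution command may run more than once for the same job, one way to avoid reserving a resource twice is to record the reservation in a per-job marker. The following is a sketch under stated assumptions: the function name reserve_once and the marker directory layout are hypothetical, the actual reservation step is site-specific and left as a comment, and the job ID is assumed to be available in the LSB_JOBID environment variable when it is not passed explicitly.

```shell
#!/bin/sh
# Hypothetical idempotent reservation helper (names and layout are assumptions).
# reserve_once LOCKROOT [JOBID]
reserve_once() {
    lockroot=$1
    jobid=${2:-${LSB_JOBID:-unknown}}
    marker="$lockroot/reserved.$jobid"
    if [ -d "$marker" ]; then
        return 0           # an earlier attempt for this job already reserved
    fi
    # mkdir either creates the marker or fails, so only one attempt proceeds
    mkdir "$marker" 2>/dev/null || return 1
    # ... reserve the license or other resource here (site-specific) ...
    return 0
}
```

A matching post-execution command would release the resource and remove the marker.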

Post-execution commands

If a post-execution command is specified, then the command is run after the job is finished regardless of the exit state of the job.

Post-execution commands are typically used to clean up some state left by the pre-execution and the job execution. LSF supports job-level, queue-level, and application-level (lsb.applications) post-execution.
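As an illustration of post-execution cleaning up state left by pre-execution, the following sketch pairs a setup helper with a cleanup helper for a per-job scratch directory. The function names and the directory layout are hypothetical, and the caller is assumed to pass a scratch root and a job ID.

```shell
#!/bin/sh
# Hypothetical helpers; names and the scratch layout are assumptions.
scratch_path() {
    # One directory per job, keyed by job ID
    printf '%s/job.%s\n' "$1" "$2"
}

scratch_setup() {      # intended for the pre-execution command
    # mkdir -p succeeds if the directory already exists, so a retried
    # pre-execution does not fail on its second run
    mkdir -p "$(scratch_path "$1" "$2")"
}

scratch_cleanup() {    # intended for the post-execution command
    # Never run rm -rf on a partially specified path
    [ -n "$1" ] && [ -n "$2" ] || return 1
    rm -rf "$(scratch_path "$1" "$2")"
}
```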

Job-level commands

The bsub -E option specifies an arbitrary command to run before starting the batch job. When LSF finds a suitable host on which to run a job, the pre-execution command is executed on that host. If the pre-execution command runs successfully, the batch job is started.

The bsub -Ep option specifies job-level post-execution commands to run on the execution host after the job finishes.

Queue-level and application-level commands

In some situations (for example, license checking), it is better to specify a queue-level or application-level pre-execution command instead of requiring every job be submitted with the -E option of bsub.

Queue-level pre-execution commands run before application-level pre-execution commands. Job-level pre-execution commands (bsub -E) override application-level pre-execution commands.

Application-level pre-execution commands run on the execution host before the job associated with the application profile is dispatched on an execution host.

When a job finishes, the application-level post-execution commands run, followed by queue-level post-execution commands if any.

Application-level post-execution commands run on the execution host after the job associated with the application profile has finished running on the execution host. They also run if the PRE_EXEC command exits with a 0 exit status, but the job execution environment failed to be set up.

Post-execution job states

Some jobs may not be considered complete until some post-job processing is performed. For example, a job may need to exit from a post-execution job script, clean up job files, or transfer job output after the job completes.

By default, the DONE or EXIT job states do not indicate whether post-processing is complete, so jobs that depend on post-processing may start prematurely. Use the post_done and post_err keywords with the bsub -w option to specify job dependency conditions for job post-processing. The corresponding job states POST_DONE and POST_ERR indicate the state of the post-processing.

The bhist command displays the POST_DONE and POST_ERR states. The resource usage of post-processing is not included in the job resource usage.
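A dependency on post-processing can be sketched as follows. The helper name dep_expr is hypothetical, and the keyword(job_ID) form of the expression is an assumption about how the post_done and post_err keywords are written; check the bsub reference for the exact syntax supported by your release.

```shell
#!/bin/sh
# Hypothetical helper: build a dependency expression for bsub -w from one of
# the post-processing keywords and a job ID.
dep_expr() {
    case $1 in
        post_done|post_err) printf '%s(%s)\n' "$1" "$2" ;;
        *) echo "unsupported keyword: $1" >&2; return 1 ;;
    esac
}

# Example (job ID 1234 and the command name are placeholders):
#   bsub -w "$(dep_expr post_done 1234)" ./next_step
```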

After the job completes, you cannot perform any job control on the post-processing. Post-processing exit codes are not reported to LSF.

Configuring Pre- and Post-Execution Commands

Pre-execution commands can be configured at the job level, in queues, or in application profiles.

Post-execution commands can be configured at the job level, in queues or in application profiles.

Order of command execution

Pre-execution commands run in the following order:

  1. The queue-level command
  2. The application-level or job-level command. If you specify a command at both the application and job levels, the job-level command overrides the application-level command; the application-level command is ignored.

If a pre-execution command is specified at the ...    Then the commands execute in the order of ...
Queue, application, and job levels                    1. Queue level
                                                      2. Job level
Queue and application levels                          1. Queue level
                                                      2. Application level
Queue and job levels                                  1. Queue level
                                                      2. Job level
Application and job levels                            1. Job level

Post-execution commands run in the following order:

  1. The application-level or job-level command. If you specify a command at both the application and job levels, the job-level command overrides the application-level command; the application-level command is ignored.
  2. The queue-level command

If both application-level (POST_EXEC in lsb.applications) and job-level post-execution commands are specified, the job-level post-execution command overrides the application-level post-execution command.

If a post-execution command is specified at the ...   Then the commands execute in the order of ...
Queue, application, and job levels                    1. Job level
                                                      2. Queue level
Queue and application levels                          1. Application level
                                                      2. Queue level
Queue and job levels                                  1. Job level
                                                      2. Queue level
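The ordering rules for both pre- and post-execution can be expressed as a short function. This is only a model of the rules stated in this section, not something LSF provides: the function name exec_order and its arguments are hypothetical.

```shell
#!/bin/sh
# exec_order PHASE QUEUE APP JOB   (hypothetical helper)
# PHASE is "pre" or "post"; the other arguments are "y" if a command is
# configured at that level. Prints the levels in the order they run.
exec_order() {
    phase=$1
    queue=$2
    app=$3
    job=$4
    # Job-level overrides application-level, so at most one of the two runs
    if [ "$job" = y ]; then
        inner=job
    elif [ "$app" = y ]; then
        inner=application
    else
        inner=
    fi
    outer=
    if [ "$queue" = y ]; then
        outer=queue
    fi
    if [ "$phase" = pre ]; then
        set -- $outer $inner     # queue level runs first
    else
        set -- $inner $outer     # queue level runs last
    fi
    echo "$*"
}
```

For example, exec_order pre y y y prints "queue job", and exec_order post y y n prints "application queue".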

Job-level commands

Job-level pre-execution and post-execution commands require no configuration. Use the bsub -E option to specify an arbitrary command to run before the job starts. Use the bsub -Ep option to specify an arbitrary command to run after the job finishes running.

Example

The following example shows a batch job that requires a tape drive. The user program tapeCheck exits with status zero if the specified tape drive is ready:

bsub -E "/usr/share/bin/tapeCheck /dev/rmt01" myJob
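The tapeCheck program itself is site-specific and is not shown in this document. As a rough idea of what such a check might look like, the following hypothetical stand-in treats a drive as ready when the device is a readable character device; a real check would likely query the drive itself.

```shell
#!/bin/sh
# Hypothetical stand-in for a tapeCheck-style program. What "ready" means is
# site-specific; this sketch only checks for a readable character device.
tape_ready() {
    dev=$1
    [ -c "$dev" ] && [ -r "$dev" ]
}

# As a standalone program the last line would be something like:
#   tape_ready "$1"; exit $?
```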

Queue-level commands

Use the PRE_EXEC and POST_EXEC keywords in the queue definition (lsb.queues) to specify pre- and post-execution commands.

The following points should be considered when setting up pre- and post-execution commands at the queue level:

Example

The following queue specifies the pre-execution command /usr/share/lsf/pri_prexec and the post-execution command /usr/share/lsf/pri_postexec.

Begin Queue
QUEUE_NAME     = priority
PRIORITY       = 43
NICE           = 10
PRE_EXEC       = /usr/share/lsf/pri_prexec
POST_EXEC      = /usr/share/lsf/pri_postexec
End Queue 

See the lsb.queues template file for additional queue examples.

Application-level commands

Use the PRE_EXEC and POST_EXEC keywords in the application profile definition (lsb.applications) to specify pre- and post-execution commands.

The following points should be considered when setting up pre- and post-execution commands at the application level:

Example

Begin Application 
NAME         = catia 
DESCRIPTION  = CATIA V5 
CPULIMIT     = 24:0/hostA      # 24 hours of host hostA 
FILELIMIT    = 20000 
DATALIMIT    = 20000           # jobs data segment limit 
CORELIMIT    = 20000 
PROCLIMIT    = 5               # job processor limit 
PRE_EXEC       = /usr/share/lsf/catia_prexec 
POST_EXEC      = /usr/share/lsf/catia_postexec 
REQUEUE_EXIT_VALUES = 55 34 78 
End Application 

See the lsb.applications template file for additional application profile examples.

Pre- and post-execution on UNIX and Linux

The entire contents of the configuration line of the pre- and post-execution commands are run under /bin/sh -c, so shell features can be used in the command.

For example, the following is valid:

PRE_EXEC = /usr/share/lsf/misc/testq_pre >> /tmp/pre.out
POST_EXEC = /usr/share/lsf/misc/testq_post | grep -v "Hey!" 

The pre- and post-execution commands are run in /tmp.

Standard input, standard output, and standard error are set to /dev/null. The output from the pre- and post-execution commands can be explicitly redirected to a file for debugging purposes.

The PATH environment variable is set to:

PATH='/bin /usr/bin /sbin /usr/sbin' 
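Because standard output and standard error are discarded by default, a small wrapper can make debugging easier. The following is a sketch only: the function name run_logged is hypothetical, and the log file location is left to the caller. It appends a timestamped header and the command's output to the log, and returns the command's own exit status so that LSF still sees the result of the real check.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with its output appended to a log file
# and return the command's own exit status.
# run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    {
        echo "== $(date) : $*"
        "$@"
    } >>"$log" 2>&1
}
```

Given the restricted PATH, commands passed to such a wrapper are best specified by full path name.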

Pre- and post-execution on Windows

The pre- and post-execution commands are run under cmd.exe /c.

Note: For pre- and post-execution commands that execute on a Windows Server 2003, x64 Edition platform, users must have "Read" and "Execute" privileges for cmd.exe.

Standard input, standard output, and standard error are set to NULL. The output from the pre- and post-execution commands can be explicitly redirected to a file for debugging purposes.

Setting a pre- and post-execution user ID

By default, both the pre- and post-execution commands are run as the job submission user. Use the LSB_PRE_POST_EXEC_USER parameter in lsf.sudoers to specify a different user ID for queue-level and application-level pre- and post-execution commands.

Example

If the pre- or post-execution commands perform privileged operations that require root permission, specify:

LSB_PRE_POST_EXEC_USER=root 

See the Platform LSF Configuration Reference for information about the lsf.sudoers file.

Including job post-execution in job finish status reporting

By default, LSF releases resources for a job as soon as the job is finished and sbatchd reports job finish status (DONE or EXIT) to mbatchd. Post-execution processing is not considered part of job processing. This makes it possible for a new job to be started before post-execution processing for a previous job is complete.

There are situations where you do not want the first job's post-execution to affect the second job's execution, or where the execution of a second job depends crucially on the completion of post-execution of the previous job.

In other cases, you may want to include job post-execution in job accounting processes, or, if the post-execution is CPU intensive, you might not want a second job running at the same time. Finally, system configuration required by the original job may be changed or removed by a new job, which could prevent the first job from finishing normally.

To enable all associated processing to complete before LSF reports job finish status, configure JOB_INCLUDE_POSTPROC=Y in an application profile in lsb.applications or cluster wide in lsb.params.

When JOB_INCLUDE_POSTPROC is set:

For job history and job accounting, the job CPU time and run time will also include the post-execution CPU time and run time.

Limitations and side-effects

Job query commands (bjobs, bhist) show that the job remains in RUN state until the post-execution processing is finished, even though the job itself has finished. Job control commands (bstop, bkill, bresume) have no effect.

Rerunnable jobs may rerun after they have actually finished because the host became unavailable before post-execution processing finished, but the mbatchd considers the job still in RUN state.

Job preemption is delayed until post-execution processing is finished.

Post-execution on SGI cpusets

Post-execution processing on SGI cpusets behaves differently from previous releases. If JOB_INCLUDE_POSTPROC=Y is specified in lsb.applications or cluster wide in lsb.params, post-execution processing is not attached to the job cpuset, and Platform LSF does not release the cpuset until post-execution processing has finished.

Preventing job overlap on hosts

You can use JOB_INCLUDE_POSTPROC to ensure that there is no execution overlap among running jobs. For example, you may have pre-execution processing to create a user execution environment at the desktop (mount a disc for the user, create rlogin permissions, and so on). Then you configure post-execution processing to clean up the user execution environment set by the pre-exec.

If the post-execution for one job is still running when a second job is dispatched, pre-execution processing that sets up the user environment for the next job may not be able to run correctly because the previous job's environment has not yet been cleaned up by its post-exec.

You should configure jobs to run exclusively to prevent the actual jobs from overlapping, but in this case, you also need to configure post-execution to be included in job finish status reporting.

Setting a post-execution timeout

Configure JOB_POSTPROC_TIMEOUT in an application profile in lsb.applications or cluster wide in lsb.params to control how long post-execution processing is allowed to run.

JOB_POSTPROC_TIMEOUT specifies a timeout in minutes for job post-execution processing. If post-execution processing takes longer than the timeout, sbatchd reports that post-execution has failed (POST_ERR status) and kills the process group of the job's post-execution processes.

The specified timeout must be greater than zero.

If JOB_INCLUDE_POSTPROC is enabled in the application profile or cluster wide in lsb.params, and sbatchd kills the post-execution processes because the timeout has been reached, the CPU time of the post-execution processing is set to 0, and the job CPU time will not include the CPU time of the post-execution processing.

Controlling how many times pre-execution commands are retried

By default, if job pre-execution fails, LSF retries the job automatically.

Configure MAX_PREEXEC_RETRY, LOCAL_MAX_PREEXEC_RETRY, or REMOTE_MAX_PREEXEC_RETRY to limit the number of times LSF retries job pre-execution. Pre-execution retry is configured cluster-wide (lsb.params), at the queue level (lsb.queues), and at the application level (lsb.applications). Pre-execution retry configured in lsb.applications overrides lsb.queues, and lsb.queues overrides lsb.params configuration.
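The override order for pre-execution retry settings can be modeled as a simple lookup. This sketch only illustrates the precedence stated above (lsb.applications, then lsb.queues, then lsb.params); the function name effective_retry is hypothetical and the function does not read any LSF configuration file.

```shell
#!/bin/sh
# effective_retry PARAMS QUEUE APP   (hypothetical helper)
# Each argument is the value configured at that level, or empty if unset.
# Prints the value that takes effect: application, then queue, then cluster-wide.
effective_retry() {
    if [ -n "$3" ]; then
        echo "$3"
    elif [ -n "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}
```

For example, effective_retry 5 3 "" prints 3, because no application-level value is set and the queue-level value overrides the cluster-wide one.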


Platform Computing Inc.
www.platform.com