If you configure job exception handling in your queues, LSF detects the following job exceptions:
Job underrun — jobs end too soon (run time is less than expected). Underrun jobs are detected when a job exits abnormally
Job overrun — job runs too long (run time is longer than expected). By default, LSF checks for overrun jobs every 1 minute. Use EADMIN_TRIGGER_DURATION in lsb.params to change how frequently LSF checks for job overrun.
Idle job — running job consumes less CPU time than expected (in terms of CPU time/runtime). By default, LSF checks for idle jobs every 1 minute. Use EADMIN_TRIGGER_DURATION in lsb.params to change how frequently LSF checks for idle jobs.
You can configure your queues to detect job exceptions. Use the following parameters:
Specify a threshold for idle jobs. The value should be a number between 0.0 and 1.0 representing CPU time/runtime. If the job idle factor is less than the specified threshold, LSF invokes eadmin to trigger the action for a job idle exception.
Specify a threshold for job overrun. If a job runs longer than the specified run time, LSF invokes eadmin to trigger the action for a job overrun exception.
Specify a threshold for job underrun. If a job exits before the specified number of minutes, LSF invokes eadmin to trigger the action for a job underrun exception.