restarts checkpointed jobs
Restarts a checkpointed job using the checkpoint files saved in checkpoint_dir/last_job_ID/. Only jobs that have been successfully checkpointed can be restarted.
Restarted jobs are resubmitted and assigned a new job ID. The checkpoint directory is renamed to match the new job ID: checkpoint_dir/new_job_ID/.
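The directory layout described above can be sketched with a small shell helper. The function name ckpt_job_dir and the example paths are hypothetical and are not part of LSF. The helper only shows how the per-job checkpoint path is composed from checkpoint_dir and a job ID.

```shell
#!/bin/sh
# Hypothetical helper (not an LSF command): print the per-job checkpoint
# directory for a given checkpoint_dir and job ID.
ckpt_job_dir() {
    # ${1%/} drops one trailing slash so the result has no doubled separator.
    printf '%s/%s\n' "${1%/}" "$2"
}

# Example with made-up values: the directory before and after a restart.
ckpt_job_dir /share/ckpt 1234   # directory used by the original job
ckpt_job_dir /share/ckpt 1301   # directory name after the job is restarted
```

After a restart, scripts that look for checkpoint files should use the directory for the new job ID, because the old one no longer exists under its previous name.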
The file path of the checkpoint directory, including the directory and file name, can contain up to 4094 characters on UNIX and Linux, or up to 255 characters on Windows.
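A submission wrapper might check the path length before it restarts a job. The following is a minimal sketch, assuming the limit is counted in characters as the shell's ${#var} reports them. The function name check_ckpt_path is hypothetical, and the default of 4094 is the UNIX and Linux limit stated above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of LSF).
# Usage: check_ckpt_path PATH [LIMIT]
# Succeeds when PATH is no longer than LIMIT characters (default 4094).
check_ckpt_path() {
    path=$1
    limit=${2:-4094}
    [ "${#path}" -le "$limit" ]
}

if check_ckpt_path "/share/ckpt/1234/chkpnt.file"; then
    echo "path length OK"
else
    echo "path too long" >&2
fi
```

Passing 255 as the second argument applies the Windows limit instead.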
By default, jobs are restarted with the same output file and file transfer specifications, job name, window signal value, checkpoint directory and period, and rerun options as the original job.
To restart a job on another host, both hosts must be binary compatible, run the same OS version, have access to the executable, have access to all open files (LSF must locate them with an absolute path name), and have access to the checkpoint directory.
The environment variable LSB_RESTART is set to Y when a job is restarted.
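A job script can test this variable to decide whether to initialize from scratch or resume. The following is a minimal sketch. The function name is_restart and the echoed messages are illustrative only.

```shell
#!/bin/sh
# Succeeds when LSF has set LSB_RESTART=Y, that is, when this run is a restart.
is_restart() {
    [ "${LSB_RESTART:-}" = "Y" ]
}

if is_restart; then
    echo "restarted job: resuming from checkpoint"
else
    echo "first run: starting from scratch"
fi
```

The ${LSB_RESTART:-} form keeps the test safe under set -u when the variable is unset, as it is on a first run.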
LSF invokes the erestart(8) executable found in LSF_SERVERDIR to perform the restart.
Only the bsub options listed here can be used with brestart.
Like bsub, brestart calls the master esub (mesub), which invokes any mandatory esub executables configured by an LSF administrator, and any executable named esub (without .application) if it exists in LSF_SERVERDIR. Only esub executables invoked by bsub can change the job environment on the submission host. An esub invoked by brestart cannot change the job environment.