Displays accounting statistics about finished jobs.
Displays a summary of accounting statistics for all finished jobs (with a DONE or EXIT status) submitted by the user who invoked the command, on all hosts, projects, and queues in the LSF system. bacct displays statistics for all jobs logged in the current LSF accounting log file: LSB_SHAREDIR/cluster_name/logdir/lsb.acct.
Statistics not reported by bacct but of interest to individual system administrators can be generated by directly using awk or perl to process the lsb.acct file.
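The throughput (T) of the LSF system, certain hosts, or certain queues is calculated by the formula: T = N / (ET - BT), where N is the total number of jobs for which accounting statistics are reported, BT is the Start time, and ET is the End time.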
You can use the option -C time0,time1 to specify the Start time as time0 and the End time as time1. In this way, you can examine throughput during a specific time period.
Jobs involved in the throughput calculation are only those being logged (that is, with a DONE or EXIT status). Jobs that are running, suspended, or that have never been dispatched after submission are not considered, because they are still in the LSF system and not logged in lsb.acct.
The total throughput of the LSF system can be calculated by specifying -u all without any of the -m, -q, -S, -D, or job_ID options. The throughput of certain hosts can be calculated by specifying -u all with -m, without the -q, -S, -D, or job_ID options. The throughput of certain queues can be calculated by specifying -u all with -q, without the -m, -S, -D, or job_ID options.
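The throughput calculation can be checked by hand. The following is a minimal sketch, not part of LSF, that divides the number of logged jobs by the elapsed hours between the Start time and the End time. The function name is hypothetical, and the figures are taken from the first sample SUMMARY shown later on this page (60 done and 118 exited jobs, Aug 8 15:48 to Aug 19 17:59). The year is not part of that output and is an arbitrary assumption.

```python
from datetime import datetime


def throughput(num_jobs: int, start: datetime, end: datetime) -> float:
    """Return the number of jobs finished per hour between start and end."""
    hours = (end - start).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("End time must be later than Start time")
    return num_jobs / hours


# Year 2008 is an assumption; bacct prints only month, day, and time.
start = datetime(2008, 8, 8, 15, 48)
end = datetime(2008, 8, 19, 17, 59)

jobs_per_hour = throughput(60 + 118, start, end)
print(f"{jobs_per_hour:.2f} jobs/hour")
```

The printed value should agree with the "Total throughput" line that bacct reports for the same set of jobs.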
bacct does not show local pending batch jobs killed using bkill -b. bacct shows MultiCluster jobs and local running jobs even if they are killed using bkill -b.
Displays accounting statistics for successfully completed jobs (with a DONE status).
Displays accounting statistics for exited jobs (with an EXIT status).
Long format. Displays detailed information for each job in a multiline format.
If the job was submitted with bsub -K, the -l option displays Synchronous Execution.
Displays jobs that have triggered a job exception (overrun, underrun, idle, runtime_est_exceeded). Use with the -l option to show the exception status for individual jobs.
Displays accounting information about jobs submitted to the specified application profile. You must specify an existing application profile configured in lsb.applications.
Displays accounting statistics for jobs that completed or exited during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
Displays accounting statistics for jobs dispatched during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
Searches the specified job log file for accounting statistics. Specify either an absolute or relative path.
The specified file path can contain up to 4094 characters for UNIX, or up to 512 characters for Windows.
Displays accounting statistics for jobs belonging to the specified License Scheduler projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (').
Displays accounting statistics for jobs dispatched to the specified hosts.
If a list of hosts is specified, host names must be separated by spaces and enclosed in quotation marks (") or (').
Normalizes CPU time by the CPU factor of the specified host or host model, or by the specified CPU factor.
If you use bacct offline by indicating a job log file, you must specify a CPU factor.
Displays accounting statistics for jobs belonging to the specified projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or ('). You cannot use one double quote and one single quote to enclose the list.
Displays accounting statistics for jobs submitted to the specified queues.
If a list of queues is specified, queue names must be separated by spaces and enclosed in quotation marks (") or (').
Displays accounting statistics for jobs submitted during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
Displays accounting statistics for jobs that ran under the specified service class.
If a default system service class is configured with ENABLE_DEFAULT_EGO_SLA in lsb.params but not explicitly configured in lsb.applications, bacct -sla service_class_name displays accounting information for the specified default service class.
Displays accounting statistics for the specified advance reservation IDs, or for all reservation IDs if the keyword all is specified.
A list of reservation IDs must be separated by spaces and enclosed in quotation marks (") or (').
The -U option also displays historical information about reservation modifications.
When combined with the -U option, -u is interpreted as the user name of the reservation creator. For example, bacct -U all -u user2 shows all the advance reservations created by user user2.
Without the -u option, bacct -U shows all advance reservation information about jobs submitted by the user.
In a MultiCluster environment, advance reservation information is only logged in the execution cluster, so bacct displays advance reservation information for local reservations only. You cannot see information about remote reservations. You cannot specify a remote reservation ID, and the keyword all only displays information about reservations in the local cluster.
Displays accounting statistics for jobs submitted by the specified users, or by all users if the keyword all is specified.
If a list of users is specified, user names must be separated by spaces and enclosed in quotation marks (") or ('). You can specify both user names and user IDs in the list of users.
Displays accounting statistics for jobs with the specified job IDs.
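Several options above take a space-separated list that must reach bacct as a single quoted argument. When a command line is assembled by a script, the quoting is easy to get wrong. The following is a minimal sketch, assuming a POSIX shell; the helper name bacct_command and the queue name priority are hypothetical. It uses only the -u, -q, and -m options described on this page.

```python
import shlex


def bacct_command(users=None, queues=None, hosts=None):
    """Build a bacct command line with each list passed as one quoted argument."""
    args = ["bacct"]
    if users:
        args += ["-u", " ".join(users)]
    if queues:
        args += ["-q", " ".join(queues)]
    if hosts:
        args += ["-m", " ".join(hosts)]
    # shlex.join quotes any argument that contains spaces.
    return shlex.join(args)


cmd = bacct_command(users=["user1", "user2"], queues=["normal", "priority"])
print(cmd)
```

Each list is enclosed in one matching pair of quotation marks, which is the form the option descriptions require.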
Statistics on jobs. The following fields are displayed:
The total, average, minimum, and maximum statistics are on all specified jobs.
The wait time is the elapsed time from job submission to job dispatch.
The turnaround time is the elapsed time from job submission to job completion.
The hog factor is the amount of CPU time consumed by a job divided by its turnaround time.
The throughput is the number of completed jobs divided by the time period to finish these jobs (jobs/hour).
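The wait time, turnaround time, and hog factor defined above can be reproduced from the timestamps that bacct -l prints. The following is a minimal sketch with a hypothetical helper name, using the values of job 1743 from the bacct -x -l example on this page. The year is an assumption, because the sample output does not show one.

```python
from datetime import datetime


def job_metrics(submitted, dispatched, completed, cpu_seconds):
    """Return (wait, turnaround, hog_factor) for one finished job."""
    wait = (dispatched - submitted).total_seconds()
    turnaround = (completed - submitted).total_seconds()
    # Hog factor is CPU time divided by turnaround time.
    hog = cpu_seconds / turnaround if turnaround > 0 else 0.0
    return wait, turnaround, hog


# Job 1743 from the sample output; year 2008 is an assumption.
submitted = datetime(2008, 8, 11, 18, 16, 17)
dispatched = datetime(2008, 8, 11, 18, 17, 22)
completed = datetime(2008, 8, 11, 18, 18, 54)

wait, turnaround, hog = job_metrics(submitted, dispatched, completed, 0.19)
print(wait, turnaround, round(hog, 4))
```

The three values correspond to the WAIT, TURNAROUND, and HOG_FACTOR columns of the per-job accounting line.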
In addition to the default format SUMMARY, displays the following fields:
Name of the user who submitted the job. If LSF fails to get the user name by getpwuid, the user ID is displayed.
The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.
The displayed job name or job command can contain up to 4094 characters.
In addition to the fields displayed by default in SUMMARY and by -b, displays the following fields:
Status that indicates the job was either successfully completed (DONE) or exited (EXIT).
Time when the job was dispatched to run on the execution hosts.
Average hog factor, equal to "CPU time" / "turnaround time".
Maximum resident memory usage of all processes in a job. By default, memory usage is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
Maximum virtual memory usage of all processes in a job. By default, swap space is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
File from which the job reads its standard input (see bsub -i input_file).
File to which the job writes its standard output (see bsub -o output_file).
File in which the job stores its standard error output (see bsub -e err_file).
The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.
The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.
The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.
The job is running longer than the number of minutes specified by the runtime estimation and a job exception has been triggered.
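The exception conditions above can be illustrated with a simplified, after-the-fact check. This is a sketch only: LSF evaluates these thresholds while the job runs, so the exceptions it reports may differ. The function name is hypothetical, the threshold values are invented for illustration, and run time is assumed to be the interval from dispatch to completion. The input values are those of job 1948 in the bacct -x -l example on this page (CPU time 0.20 seconds, dispatched 14:15:15, completed 14:25:08).

```python
def job_exceptions(cpu_seconds, run_seconds,
                   job_idle=None, job_overrun=None, job_underrun=None):
    """Return the exception names a finished job would match.

    job_idle is an idle factor (CPU time / run time);
    job_overrun and job_underrun are in minutes.
    """
    found = []
    run_minutes = run_seconds / 60.0
    if job_overrun is not None and run_minutes > job_overrun:
        found.append("overrun")
    if job_underrun is not None and run_minutes < job_underrun:
        found.append("underrun")
    if (job_idle is not None and run_seconds > 0
            and cpu_seconds / run_seconds < job_idle):
        found.append("idle")
    return found


# 14:15:15 to 14:25:08 is 593 seconds of run time.
# Thresholds below are hypothetical queue settings.
exceptions = job_exceptions(0.20, 593,
                            job_idle=0.10, job_overrun=5, job_underrun=2)
print(" ".join(exceptions))
```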
Job was submitted with the -K option. LSF submits the job and waits for the job to complete.
The job description assigned by the user at job submission with bsub. This field is omitted if no job description has been assigned.
The displayed job description can contain up to 4094 characters.
When LSF detects that a job is terminated, bacct -l displays one of the following termination reasons. The corresponding integer value logged to the JOB_FINISH record in lsb.acct is given in parentheses.
TERM_CWD_NOTEXIST: Current working directory is not accessible or does not exist on the execution host (25)
TERM_CPULIMIT: Job killed after reaching LSF CPU usage limit (12)
TERM_EXTERNAL_SIGNAL: Job killed by a signal external to LSF (17)
TERM_FORCE_ADMIN: Job killed by root or LSF administrator without time for cleanup (9)
TERM_FORCE_OWNER: Job killed by owner without time for cleanup (8)
TERM_MEMLIMIT: Job killed after reaching LSF memory usage limit (16)
TERM_PROCESSLIMIT: Job killed after reaching LSF process limit (7)
TERM_REQUEUE_ADMIN: Job killed and requeued by root or LSF administrator (11)
TERM_RUNLIMIT: Job killed after reaching LSF run time limit (5)
TERM_SLURM: Job terminated abnormally in SLURM (node failure) (22)
TERM_SWAP: Job killed after reaching LSF swap usage limit (20)
TERM_THREADLIMIT: Job killed after reaching LSF thread limit (21)
TERM_UNKNOWN: LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed (0)
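Scripts that read the integer value from a JOB_FINISH record may need to map it back to a reason name. The following is a minimal lookup sketch that contains only the values listed above; the names TERM_REASONS and term_reason are hypothetical, and any value not listed on this page is reported as unlisted rather than guessed.

```python
# Integer values and names exactly as listed on this page.
TERM_REASONS = {
    0: "TERM_UNKNOWN",
    5: "TERM_RUNLIMIT",
    7: "TERM_PROCESSLIMIT",
    8: "TERM_FORCE_OWNER",
    9: "TERM_FORCE_ADMIN",
    11: "TERM_REQUEUE_ADMIN",
    12: "TERM_CPULIMIT",
    16: "TERM_MEMLIMIT",
    17: "TERM_EXTERNAL_SIGNAL",
    20: "TERM_SWAP",
    21: "TERM_THREADLIMIT",
    22: "TERM_SLURM",
    25: "TERM_CWD_NOTEXIST",
}


def term_reason(code: int) -> str:
    """Return the termination reason name for a logged integer value."""
    return TERM_REASONS.get(code, f"unlisted ({code})")


print(term_reason(5))
print(term_reason(99))
```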
bacct
Accounting information about jobs that are:
- submitted by users user1.
- accounted on all projects.
- completed normally or exited.
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 60 Total number of exited jobs: 118
Total CPU time consumed: 1011.5 Average CPU time consumed: 5.7
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.0
Total wait time in queues: 134598.0
Average wait time in queue: 756.2
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 0.0
Average turnaround time: 3585 (seconds/job)
Maximum turnaround time: 77524 Minimum turnaround time: 6
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.67 (jobs/hour) during 266.18 hours
Beginning time: Aug 8 15:48 Ending time: Aug 19 17:59
bacct -x -l
Accounting information about jobs that are:
- submitted by users user1,
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <1743>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 30>
Mon Aug 11 18:16:17: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 11 18:17:22: Dispatched to <hostC>;
Mon Aug 11 18:18:54: Completed <done>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 65 157 done 0.0012 4M 5M
------------------------------------------------------------------------------
Job <1948>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 550>, Job Description <This job is a test job.>
Tue Aug 12 14:15:03: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:15:15: Dispatched to <hostC>;
Tue Aug 12 14:25:08: Completed <done>.
EXCEPTION STATUS: overrun idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.20 12 605 done 0.0003 4M 5M
------------------------------------------------------------------------------
Job <1949>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 400>
Tue Aug 12 14:26:11: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:26:18: Dispatched to <hostC>;
Tue Aug 12 14:33:16: Completed <done>.
EXCEPTION STATUS: idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.17 7 425 done 0.0004 4M 5M
Job <719[14]>, Job Name <test[14]>, User <user1>, Project <default>, Status <EXIT>, Queue <normal>, Command </home/user1/job1>, Job Description <This job is another test job.>
Mon Aug 18 20:27:44: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 18 20:31:16: [14] dispatched to <hostA>;
Mon Aug 18 20:31:18: Completed <exit>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 212 214 exit 0.0009 2M 4M
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 45 Total number of exited jobs: 56
Total CPU time consumed: 1009.1 Average CPU time consumed: 10.0
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.1
Total wait time in queues: 116864.0
Average wait time in queue: 1157.1
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 7.0
Average turnaround time: 1317 (seconds/job)
Maximum turnaround time: 7070 Minimum turnaround time: 10
Average hog factor of a job: 0.01 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.59 (jobs/hour) during 170.21 hours
Beginning time: Aug 11 18:18 Ending time: Aug 18 20:31
bacct -U user1#2
Accounting for:
- advanced reservation IDs: user1#2
- advanced reservations created by user1
-----------------------------------------------------------------------------
RSVID TYPE CREATOR USER NCPUS RSV_HOSTS TIME_WINDOW
user1#2 user user1 user1 1 hostA:1 9/16/17/36-9/16/17/38
SUMMARY:
Total number of jobs: 4
Total CPU time consumed: 0.5 second
Maximum memory of a job: 4.2 MB
Maximum swap of a job: 5.2 MB
Total duration time: 0 hour 2 minute 0 second
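The "Total duration time" in the reservation summary above matches the length of the TIME_WINDOW field. The following is a minimal sketch that computes the length, assuming the field has the form month/day/hour/minute-month/day/hour/minute as the sample suggests. The function name is hypothetical, the year is an arbitrary assumption, and windows that cross a year boundary are not handled.

```python
from datetime import datetime


def window_seconds(time_window: str, year: int = 2008) -> float:
    """Return the length in seconds of a begin-end reservation time window."""
    begin_text, end_text = time_window.split("-")

    def parse(text):
        # Assumed field order: month/day/hour/minute.
        month, day, hour, minute = (int(part) for part in text.split("/"))
        return datetime(year, month, day, hour, minute)

    return (parse(end_text) - parse(begin_text)).total_seconds()


seconds = window_seconds("9/16/17/36-9/16/17/38")
print(f"{int(seconds // 3600)} hour {int(seconds % 3600 // 60)} minute "
      f"{int(seconds % 60)} second")
```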
When a job finishes, LSF reports the last job termination action it took against the job and logs it into lsb.acct.
If a running job exits because of node failure, LSF sets the correct exit information in lsb.acct, lsb.events, and the job output file.
bacct -l 7265
Accounting information about jobs that are:
- submitted by all users.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <7265>, User <lsfadmin>, Project <default>, Status <EXIT>, Queue <normal>, Command <srun sleep 100000>, Job Description <This job is also a test job.>
Thu Sep 16 15:22:09: Submitted from host <hostA>, CWD <$HOME>;
Thu Sep 16 15:22:20: Dispatched to 4 Hosts/Processors <4*hostA>;
Thu Sep 16 15:22:20: slurm_id=21793;ncpus=4;slurm_alloc=n[13-14];
Thu Sep 16 15:23:21: Completed <exit>; TERM_RUNLIMIT: job killed after reaching LSF run time limit.
Accounting information about this job:
Share group charged </lsfadmin>
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.04 11 72 exit 0.0006 0K 0K
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 0 Total number of exited jobs: 1
Total CPU time consumed: 0.0 Average CPU time consumed: 0.0
Maximum CPU time of a job: 0.0 Minimum CPU time of a job: 0.0
Total wait time in queues: 11.0
Average wait time in queue: 11.0
Maximum wait time in queue: 11.0 Minimum wait time in queue: 11.0
Average turnaround time: 72 (seconds/job)
Maximum turnaround time: 72 Minimum turnaround time: 72
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.00 Minimum hog factor of a job: 0.00
Additional allocation on num_hosts Hosts/Processors host_list
For example, assume that a resizable job is submitted and the initial allocation is on hostA and hostB. The first resize request is allocated on hostC and hostD. A second resize request is allocated on hostE. bacct -l displays:
bacct -l 1150
Accounting information about jobs that are:
- submitted by all users.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
-----------------------------------------------------------------------------
Job <1150>, User <user2>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 10>, Job Description <This job is a test job.>
Mon Jun 2 11:42:00: Submitted from host <hostA>, CWD <$HOME>;
Mon Jun 2 11:43:00: Dispatched to 2 Hosts/Processors <hostA> <hostB>;
Mon Jun 2 11:43:52: Additional allocation on 2 Hosts/Processors <hostC> <hostD>;
Mon Jun 2 11:44:55: Additional allocation on <hostE>;
Mon Jun 2 11:51:40: Completed <done>.
...
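The full allocation of a resized job is the initial dispatch plus every additional allocation. The following is a minimal sketch that collects host names from the event lines of the example above. The function name is hypothetical, and the parsing assumes each host appears as its own <host> token; condensed forms such as <4*hostA> are not expanded.

```python
import re


def allocated_hosts(event_lines):
    """Collect hosts from dispatch and additional-allocation lines, in order."""
    hosts = []
    for line in event_lines:
        if "Dispatched to" in line or "Additional allocation on" in line:
            for host in re.findall(r"<([^<>]+)>", line):
                if host not in hosts:
                    hosts.append(host)
    return hosts


# Event lines copied from the sample output above.
lines = [
    "Mon Jun 2 11:43:00: Dispatched to 2 Hosts/Processors <hostA> <hostB>;",
    "Mon Jun 2 11:43:52: Additional allocation on 2 Hosts/Processors <hostC> <hostD>;",
    "Mon Jun 2 11:44:55: Additional allocation on <hostE>;",
    "Mon Jun 2 11:51:40: Completed <done>.",
]

hosts = allocated_hosts(lines)
print(hosts)
```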