Administrative tool for LSF.
badmin provides a set of subcommands to control and monitor LSF. If no subcommands are supplied for badmin, badmin prompts for a subcommand from standard input.
Information about each subcommand is available through the help command.
The badmin subcommands include privileged and non-privileged subcommands. Privileged subcommands can only be invoked by root or LSF administrators. Privileged subcommands are:
The configuration file lsf.sudoers(5) must be configured before a non-root user can use the privileged command hstartup.
All other commands are non-privileged commands and can be invoked by any LSF user. If the privileged commands are to be executed by the LSF administrator, badmin must be installed setuid root, because it needs to send the request using a privileged port.
For subcommands for which multiple hosts can be specified, do not enclose the host names in quotation marks.
Checks LSF configuration files located in the LSB_CONFDIR/cluster_name/configdir directory, and checks LSF_ENVDIR/lsf.licensescheduler.
The LSB_CONFDIR variable is defined in lsf.conf (see lsf.conf(5)), which is in LSF_ENVDIR or /etc (if LSF_ENVDIR is not defined).
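The lookup order above can be sketched in Python. This is a minimal illustration, not LSF code: the helper names are hypothetical, and it assumes lsf.conf holds simple NAME=value lines with # comments.

```python
import posixpath


def lsf_conf_path(env):
    """Return the expected lsf.conf location: LSF_ENVDIR if set, else /etc."""
    env_dir = env.get("LSF_ENVDIR") or "/etc"
    return posixpath.join(env_dir, "lsf.conf")


def read_param(conf_text, name):
    """Return the value of a NAME=value line in conf_text, or None if absent.

    Assumption: blank lines and lines starting with '#' are ignored.
    """
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            return value.strip().strip('"')
    return None


def batch_config_dir(conf_text, cluster_name):
    """Build LSB_CONFDIR/cluster_name/configdir from the lsf.conf contents."""
    confdir = read_param(conf_text, "LSB_CONFDIR")
    if confdir is None:
        raise KeyError("LSB_CONFDIR is not defined in lsf.conf")
    return posixpath.join(confdir, cluster_name, "configdir")
```

For example, with LSB_CONFDIR=/opt/lsf/conf/lsbatch and a cluster named cluster1 (both made up for illustration), batch_config_dir returns /opt/lsf/conf/lsbatch/cluster1/configdir.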
Displays full pending reason list if CONDENSE_PENDING_REASONS=Y is set in lsb.params. For example:
Configuration files are checked for errors and the results displayed to stderr. If no errors are found in the configuration files, a reconfiguration request is sent to mbatchd and configuration files are reloaded.
With this option, mbatchd is not restarted and lsb.events is not replayed. To restart mbatchd and replay lsb.events, use badmin mbdrestart.
When you issue this command, mbatchd remains available to service requests while the configuration files are reloaded. Configuration changes made since system boot or the last reconfiguration take effect.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, reconfiguration is not performed, and badmin exits.
If you add a host to a queue or to a host group or compute unit, the new host is not recognized by jobs that were submitted before you reconfigured. If you want the new host to be recognized, you must use the command badmin mbdrestart.
Resource requirements determined by the queue no longer apply to a running job after running badmin reconfig. For example, if you change the RES_REQ parameter in a queue and reconfigure the cluster, the previous queue-level resource requirements for running jobs are lost.
Verbose mode. Displays detailed messages about the status of the configuration files. Without this option, the default is to display the results of configuration file checking. All messages from the configuration file check are printed to stderr.
Disables interaction and proceeds with reconfiguration if configuration files contain no fatal errors.
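The outcome of the configuration check described above can be modeled as a small decision function. This is only a sketch of the documented behavior: the function name and return values are hypothetical, and what happens after the prompt is answered is not modeled.

```python
def reconfig_action(warnings, fatal_errors, interactive=True):
    """Model the result of the configuration check before reconfiguration.

    Returns "abort" when fatal errors exist, "prompt" when only warnings
    exist and interaction is enabled, and "reconfigure" otherwise.
    """
    if fatal_errors > 0:
        # Fatal errors always stop reconfiguration, even without interaction.
        return "abort"
    if warnings > 0 and interactive:
        # badmin asks whether to display detailed messages.
        return "prompt"
    return "reconfigure"
```

Passing interactive=False corresponds to the option that disables interaction: warnings no longer cause a prompt, but fatal errors still prevent reconfiguration.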
Dynamically reconfigures LSF and restarts mbatchd and mbschd.
Configuration files are checked for errors and the results printed to stderr. If no errors are found, configuration files are reloaded, mbatchd and mbschd are restarted, and events in lsb.events are replayed to recover the running state of the last mbatchd. While mbatchd restarts, it is unavailable to service requests.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, mbatchd and mbschd restart is not performed, and badmin exits.
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
Verbose mode. Displays detailed messages about the status of configuration files. All messages from configuration checking are printed to stderr.
Disables interaction and forces reconfiguration and mbatchd restart to proceed if configuration files contain no fatal errors.
Activates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. Jobs in a queue can be dispatched if the queue is activated.
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file.
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist displays the comment text.
Opens batch server hosts. Specify the names of any server hosts, host groups, or compute units. All batch server hosts are opened if the reserved word all is specified. If no host, host group, or compute unit is specified, the local host is assumed. A host accepts batch jobs if it is open.
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you open a host group or compute unit, each member displays with the same comment string.
Closes batch server hosts. Specify the names of any server hosts, host groups, or compute units. All batch server hosts are closed if the reserved word all is specified. If no argument is specified, the local host is assumed. A closed host does not accept any new job, but jobs already dispatched to the host are not affected. Note that this is different from a host closed by a window; all jobs on it are suspended in that case.
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you close a host group or compute unit, each member displays with the same comment string.
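A script that closes hosts with an administrator comment might build the command line as follows. This is a minimal sketch: it assumes the subcommand described here is hclose with the -C comment option (both are named in the hhist description), the helper name is hypothetical, and counting the 512-character limit with len() is an assumption. The function only builds the argument list and does not run badmin.

```python
MAX_COMMENT_LEN = 512  # documented maximum length of the comment string


def hclose_argv(hosts, comment=None):
    """Build an argument list for 'badmin hclose [-C comment] [host ...]'."""
    if comment is not None and len(comment) > MAX_COMMENT_LEN:
        raise ValueError("comment exceeds %d characters" % MAX_COMMENT_LEN)
    argv = ["badmin", "hclose"]
    if comment is not None:
        argv += ["-C", comment]
    # With no hosts given, badmin assumes the local host.
    argv += list(hosts)
    return argv
```

The resulting list can be passed to subprocess.run on a host where badmin is installed; because each host is a separate list element, no shell quoting is involved.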
If dynamic host configuration is enabled, dynamically adds hosts to a host group or compute unit. After receiving the host information from the master LIM, mbatchd dynamically adds the host without triggering a reconfig.
Once the host is added to the host group or compute unit, it is considered part of that group with respect to scheduling decision making for both newly submitted jobs and for existing pending jobs.
This command fails if any of the specified host groups, compute units, or host names are not valid.
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
Dynamically deletes hosts from a host group or compute unit by triggering an mbatchd reconfig.
This command fails if any of the specified host groups, compute units, or host names are not valid.
Specify the name of the file into which timing messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log file directory.
The name of the file created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Note: Both timing and debug messages are logged in the same files.
Default: current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name.
Starts sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. Only root and users listed in the file lsf.sudoers(5) can use the all and -f options. These users must be able to use rsh or ssh on all LSF hosts without having to type in passwords. If no host is specified, the local host is assumed.
Displays historical events for specified hosts, or for all hosts if no host is specified. Host events are host opening and closing.
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all host events in the event log file.
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the host control commands hclose or hopen, hhist displays the comment text.
Displays historical events for mbatchd. Events describe the starting and exiting of mbatchd.
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all mbatchd events in the event log file.
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the mbdrestart command, mbdhist displays the comment text.
Displays historical events for all the queues, hosts and mbatchd.
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all events in the event log file.
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue, host, and mbatchd commands, hist displays the comment text.
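The history subcommands (qhist, hhist, mbdhist, and hist) share the same default event log location. A minimal sketch of how that default is formed, with a hypothetical helper name and POSIX-style path separators assumed:

```python
import posixpath


def event_log_path(lsb_sharedir, cluster_name, override=None):
    """Return the event log file a history subcommand would read.

    override stands for a file name given with the -f option; it may be
    absolute or relative and is returned unchanged.
    """
    if override:
        return override
    return posixpath.join(lsb_sharedir, cluster_name, "logdir", "lsb.events")
```

Supplying an override is what makes offline analysis possible: a copy of lsb.events can be inspected away from the live LSB_SHAREDIR directory.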
Displays the syntax and functionality of the specified commands.
Sets message log level for mbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
Sets timing level for mbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
Sets the message log level for sbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, debug levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
class_name=0 (no additional classes are logged)
debug_level=0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
Specifies software classes for which debug messages are to be logged.
Format of class_name is the name of a class, or a list of class names separated by spaces and enclosed in quotation marks. Classes are also listed in lsf.h.
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
0 LOG_DEBUG level in parameter LSF_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
2 LOG_DEBUG2 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
3 LOG_DEBUG3 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
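The inclusion rule for debug levels can be expressed as a short lookup. This is an illustrative sketch only; the table and function name are hypothetical and simply restate the level list above.

```python
# Numeric debug_level values and the log mask level each one selects.
DEBUG_LEVELS = {
    0: "LOG_DEBUG",
    1: "LOG_DEBUG1",
    2: "LOG_DEBUG2",
    3: "LOG_DEBUG3",
}


def included_levels(debug_level):
    """Return the selected level followed by every lower level it includes."""
    if debug_level not in DEBUG_LEVELS:
        raise ValueError("debug_level must be 0, 1, 2, or 3")
    return [DEBUG_LEVELS[n] for n in range(debug_level, -1, -1)]
```

For instance, included_levels(3) lists LOG_DEBUG3, LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG, matching the example given for level 3.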
Specify the name of the file into which debugging messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file that is created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
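The file naming rules above can be summarized in a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, whether the given path is valid is passed in as a flag rather than checked on disk, and POSIX-style separators are used throughout for simplicity.

```python
import posixpath


def debug_log_file(daemon_name, host_name, system_log_dir,
                   logfile_name=None, path_is_valid=True, windows=False):
    """Return the log file that would receive debug messages, or None.

    Models the documented rules: default name daemon_name.log.host_name,
    custom name logfile_name.daemon_name.log.host_name, fallback to /tmp
    on UNIX for an invalid path, and no file at all on Windows.
    """
    suffix = "%s.log.%s" % (daemon_name, host_name)
    if logfile_name is None:
        # No file name given: the current system log file is used.
        return posixpath.join(system_log_dir, suffix)
    directory, base = posixpath.split(logfile_name)
    if not directory:
        # A bare file name is saved in the LSF system log directory.
        directory = system_log_dir
    elif not path_is_valid:
        if windows:
            return None
        directory = "/tmp"
    return posixpath.join(directory, "%s.%s" % (base, suffix))
```

With a bare file name such as dbg for mbatchd on hostA (example values), the result is dbg.mbatchd.log.hostA inside the system log directory.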
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LSF_LOG_MASK and classes are reset to the value of LSB_DEBUG_MBD, LSB_DEBUG_SBD.
Optional. Sets debug settings on the specified host or hosts.
Lists of host names must be separated by spaces and enclosed in quotation marks.
Sets the timing level for sbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, timing levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
timing_level=no timing information is recorded
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
Specifies detail of timing information that is included in log files. Timing messages indicate the execution time of functions in the software and are logged in milliseconds.
Valid values: 1 | 2 | 3 | 4 | 5
The higher the number, the more functions in the software that are timed and whose execution time is logged. The lower numbers include more common software functions. Higher levels include all lower levels.
Specify the name of the file into which timing messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log file directory.
The name of the file created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Note: Both timing and debug messages are logged in the same files.
Default: current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name.
Optional. Turns off temporary timing settings and resets them to the daemon starting state. The timing level is reset back to the value of the parameter for the corresponding daemon (LSB_TIME_MBD, LSB_TIME_SBD).
Sets the timing level on the specified host or hosts.
Lists of hosts must be separated by spaces and enclosed in quotation marks.
Sets message log level for mbschd to include additional information in log files. You must be root or the LSF administrator to use this command.
Sets timing level for mbschd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
Displays all configured parameters and their values set in lsf.conf or ego.conf that affect mbatchd and sbatchd.
In a MultiCluster environment, badmin showconf only displays the parameters of daemons on the local cluster.
Running badmin showconf from a master candidate host reaches all server hosts in the cluster. Running badmin showconf from a slave-only host may not be able to reach other slave-only hosts.
badmin showconf only displays the values used by LSF.
For example, if you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST.
badmin showconf displays the value of EGO_MASTER_LIST from wherever it is defined. You can define either LSF_MASTER_LIST or EGO_MASTER_LIST in lsf.conf. LIM reads lsf.conf first, and ego.conf if EGO is enabled in the LSF cluster. The value of LSF_MASTER_LIST is displayed only if EGO_MASTER_LIST is not defined at all in ego.conf.
For example, if EGO is enabled in the LSF cluster, and you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST in ego.conf.
If EGO is disabled, ego.conf is not loaded, so whatever is defined in lsf.conf is displayed.
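The precedence between LSF_MASTER_LIST and EGO_MASTER_LIST can be sketched as follows. This is an interpretation of the rules above, not LSF code: the function name is hypothetical, each configuration file is represented as a plain dictionary, and letting ego.conf override lsf.conf for the same parameter is an assumption based on the stated read order.

```python
def displayed_master_list(lsf_conf, ego_conf, ego_enabled):
    """Return (parameter_name, value) as showconf would display it, or None."""
    effective = dict(lsf_conf)
    if ego_enabled:
        # Assumption: ego.conf is read after lsf.conf, so it wins on conflict.
        effective.update(ego_conf)
    # EGO_MASTER_LIST is displayed from wherever it is defined.
    if "EGO_MASTER_LIST" in effective:
        return ("EGO_MASTER_LIST", effective["EGO_MASTER_LIST"])
    # LSF_MASTER_LIST is shown only when EGO_MASTER_LIST is not defined.
    if "LSF_MASTER_LIST" in effective:
        return ("LSF_MASTER_LIST", effective["LSF_MASTER_LIST"])
    return None
```

When ego_enabled is False the ego_conf dictionary is ignored entirely, which mirrors ego.conf not being loaded.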
Dynamically enables and controls scheduler performance metric collection.
Collecting and recording performance metric data may affect the performance of LSF. Smaller sampling periods result in the lsb.streams file growing faster.
Starts performance metric collection dynamically and specifies an optional sampling period in seconds for performance metric collection.
If no sampling period is specified, the default period set in SCHED_METRIC_SAMPLE_PERIOD in lsb.params is used.
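The choice of sampling period can be sketched as a simple fallback. The function name is hypothetical, lsb.params is represented as a dictionary of parameter strings, and rejecting non-positive periods is an assumption rather than documented behavior.

```python
def effective_sample_period(requested, lsb_params):
    """Return the sampling period in seconds, or None if none is available.

    requested is the period given on the command line (None if omitted).
    """
    if requested is not None:
        if requested <= 0:
            # Assumption: a sampling period must be a positive number of seconds.
            raise ValueError("sampling period must be positive")
        return requested
    default = lsb_params.get("SCHED_METRIC_SAMPLE_PERIOD")
    return int(default) if default is not None else None
```

An explicit period always takes priority; the lsb.params value is consulted only when no period is given.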
Displays real-time performance metric information for the current sampling period.