For detailed information about what’s new in Platform LSF Version 7 Update 4, visit the Platform Computing Web site to see features and benefits: http://www.platform.com/Products/platform-lsf-family/platform-lsf/features-and-benefits.
DO NOT use the UNIX and Linux upgrade steps to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 4. Follow the manual steps in the document Migrating to Platform LSF Version 7 Update 4 on UNIX and Linux to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 4 on UNIX and Linux.
See the Platform Computing Web site for information about supported operating systems and system requirements for Platform LSF: http://www.platform.com/Products/platform-lsf-family/platform-lsf/system-requirements.
Full backward compatibility: your applications will run under LSF Version 7 without changing any code.
The Platform LSF Version 7 API is fully compatible with the LSF Version 6.x and 5.x APIs. An application linked with the LSF Version 6.x or 5.x libraries will run under LSF Version 7 without relinking.
To take full advantage of new Platform LSF Version 7 features, you should recompile your existing LSF applications with LSF Version 7.
See the LSF API Reference for more information.
lsb_submit() and lsb_modify(): Add the options SUB3_INTERACTIVE_SSH, SUB3_XJOB_SSH, SUB3_RUNTIME_ESTIMATION, SUB3_RUNTIME_ESTIMATION_ACC, SUB3_RUNTIME_ESTIMATION_PERC, SUB3_AUTO_RESIZE, and SUB3_RESIZE_NOTIFY_CMD to the JobSubReq structure options3 field.
lsb_readjobinfo() and lsb_readjobinfo_cond(): Add the int runtimeEstimation field to the submit structure
lsb_submit(): Add notifyCmd field to submit structure for job resize notification command to be invoked on the first execution host when a resize request has been satisfied.
lsb_launch(): Add LSF_DJOB_USE_LOGIN_SHELL and LSF_DJOB_USE_BOURNE_SHELL options to the userOptions parameter to specify which shell to launch commands through: the user login shell or the Bourne shell (/bin/sh).
The syntax of resource requirement selection strings has been enhanced to make resource selection more consistent and rigorous. When LSF_STRICT_RESREQ=Y is configured in lsf.conf, resource requirement strings in select sections must conform to a more strict syntax. Strict syntax checking does not apply to the other resource requirement sections (order, rusage, same, or span). However, when LSF_STRICT_RESREQ=Y in lsf.conf, LSF also rejects resource requirement strings where an rusage section contains a non-consumable resource. When LSF_STRICT_RESREQ=N, the default resource requirement selection string evaluation is performed.
LSF installation with lsfinstall has been improved to support the LDAP environment for host name, user name, and services lookup.
Remove manual steps for using a shared installation directory
In Windows master host installation, set the LSF password file directly, so cluster administrators can submit jobs without extra steps to verify LSF cluster
Add a new dialog for shared location on all hosts installation, so you can decide to use a shared location as LSF_CONFDIR or keep local configuration
Add new dialog for Windows service account, so you can choose another account instead of LocalSystem
Platform Management Console (PMC) is in a separate msi package
The Platform LSF License Scheduler flexible grid interface license management plugin is no longer supported.
The Platform Management Console allows you to submit and monitor jobs, and provides access to job reports. You can choose to submit jobs directly from the PMC, or through a separate workload management application. After logging on to the PMC, you can monitor the jobs you submitted through either the console or a workload management application.
You can submit jobs through a number of standard interfaces in the PMC. A generic interface is provided, along with several pre-configured and customizable interfaces (such as Fluent, LS-DYNA, ABAQUS, ANSYS, Nastran, and EnginFrame).
Enabling resizable jobs allows LSF to run a job with minimum and maximum slots requested and have it dynamically use the number of slots available at any given time.
By default, if a job specifies minimum and maximum slot requests (bsub -n min_slots,max_slots), LSF makes a one-time allocation and schedules the job. You can configure resizable jobs so that LSF dispatches jobs as long as the minimum slot request is satisfied. After the job successfully starts, LSF continues to schedule and allocate additional resources to satisfy the maximum slot request for the job.
The allocation change request may be triggered automatically or by the bresize command. For example, after the job starts, you can explicitly cancel resize allocation requests or have the job release idle resources back to LSF.
An autoresizable job is a resizable job with a minimum and maximum slot request. LSF automatically schedules and allocates additional resources to satisfy the job's maximum request as the job runs.
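The dispatch-then-grow behavior described above can be pictured with a toy model. This is only a sketch under simplifying assumptions: the function names are hypothetical, slots are treated as a single pool, and none of the real LSF scheduling policies (queues, limits, host selection) are modeled.

```python
def initial_allocation(free_slots, min_slots, max_slots):
    """Slots granted at dispatch, or None if the job must stay pending.

    Toy model: a resizable job starts as soon as the minimum request
    can be met, taking at most the maximum request.
    """
    if free_slots < min_slots:
        return None
    return min(free_slots, max_slots)


def grow(allocated, newly_free, max_slots):
    """Slots held after more slots become free, capped at the maximum."""
    return min(max_slots, allocated + newly_free)


# A job asking for between 4 and 16 slots when only 6 are free:
start = initial_allocation(free_slots=6, min_slots=4, max_slots=16)
later = grow(start, newly_free=20, max_slots=16)
```

In this model the job starts with 6 slots and later grows to its maximum of 16, while a job that cannot get its minimum stays pending.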
LSF can automatically shut down any running daemons on hosts that attempted to join the cluster but failed to communicate within the LSF_DYNAMIC_HOST_WAIT_TIME period. To use this feature, enable EGO_ENABLE_AUTO_DAEMON_SHUTDOWN in lsf.conf. This feature also works when dynamic host support is not enabled.
On UNIX hosts, you can set a log file owner for the LSF daemons (not including the mbschd) to change the default owner (LSF administrator). Changes are made to LSF_LOGFILE_OWNER in lsf.conf.
For jobs that need more than one resource before they will run, you can choose to reserve resources for pending jobs that are waiting for another resource to become available. This ensures that when the rare resource becomes available, the job will already have any other resources reserved and will therefore run right away.
See lsb.resources in the LSF Configuration Reference for more information.
The number of characters allowed in a file name, including the directory path, has been lengthened. File names can be up to and including 4094 characters. Checkpoint directories are limited to 4000 characters.
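A script that generates long paths can check them against these limits before submitting a job. The sketch below is a minimal illustration, not LSF code: the function and constant names are hypothetical, and it assumes the 4000-character checkpoint limit is inclusive, as the 4094-character limit is stated to be.

```python
MAX_FILE_PATH_CHARS = 4094        # file name including directory path
MAX_CHECKPOINT_DIR_CHARS = 4000   # assumed to be an inclusive limit


def path_within_limit(path, is_checkpoint_dir=False):
    """Return True if the path length is within the documented limit."""
    if is_checkpoint_dir:
        limit = MAX_CHECKPOINT_DIR_CHARS
    else:
        limit = MAX_FILE_PATH_CHARS
    return len(path) <= limit
```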
A number of commands often require you to specify host names. You can now specify host name ranges instead.
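These notes do not give the range syntax, so the following sketch assumes a bracketed form such as hostA[1-3,5] purely for illustration; check the Platform LSF Command Reference for the syntax that LSF actually accepts. The function name is hypothetical.

```python
import re

# Assumed form: prefix[1-3,5]. This syntax is an assumption, not taken
# from these release notes.
_RANGE = re.compile(r"^(?P<prefix>[^\[\]]+)\[(?P<body>[^\[\]]+)\]$")


def expand_host_range(spec):
    """Expand an assumed 'prefix[a-b,c]' range into a list of host names."""
    match = _RANGE.match(spec)
    if match is None:
        return [spec]  # plain host name, nothing to expand
    prefix = match.group("prefix")
    hosts = []
    for part in match.group("body").split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
        else:
            low = high = int(part)
        if low > high:
            raise ValueError(f"descending range in {spec!r}")
        hosts.extend(f"{prefix}{i}" for i in range(low, high + 1))
    return hosts
```

Under that assumed syntax, expand_host_range("hostA[1-3,5]") yields hostA1, hostA2, hostA3, and hostA5, and a plain host name is returned unchanged.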
Processor binding for LSF job processes takes advantage of the power of multiple processors and multiple cores to provide hard processor binding functionality for sequential LSF jobs and parallel jobs that run on a single host. You can set LSF_BIND_JOB in lsf.conf or BIND_JOB in lsb.applications to one of six values: BALANCE, ANY, NONE, PACK, USER_CPU_LIST, or USER.
The new host group administrators can open or close any hosts that belong to their host group.
User groups can now have optional administrators that can control the jobs their users submit. They can even resume their users’ jobs that were suspended by the cluster administrator.
When you have defined your queues with a list of hosts and you submit a job while specifying hosts with bsub -m, if you specify a host that does not belong to the correct queue, the job still runs on the other hosts specified. Before this enhancement, the job would have failed because it could not run on the host that does not belong to the queue. This is especially useful if you use scripts to run your jobs and sometimes modify the host members of a queue.
The resource requirement order string now supports numeric static resources in addition to builtin and external load indices.
Event streaming is now disabled by default. To enable event streaming, define ENABLE_EVENT_STREAM=Y in lsb.params.
The following configuration parameters and environment variables are new or changed for LSF Version 7 Update 4:
BIND_JOB: The values for BIND_JOB have changed from YES or NO to BALANCE, ANY, NONE, PACK, USER_CPU_LIST, USER to specify the processor binding policy for sequential and parallel job processes that run on a single host.
DJOB_RESIZE_GRACE_PERIOD=seconds: When a resizable job releases resources, the LSF distributed parallel job framework terminates running tasks if a host has been completely removed. A DJOB_RESIZE_GRACE_PERIOD defines a grace period in seconds for the application to clean up tasks itself before LSF forcibly terminates them.
LOCAL_MAX_PREEXEC_RETRY=integer: The maximum number of times to attempt the pre-execution command of a job on the local cluster. Specify a value between 0 < MAX_PREEXEC_RETRY < INFINIT_INT. INFINIT_INT is defined in lsf.h. By default, the number of preexec retry times is unlimited.
MAX_PREEXEC_RETRY=integer: MultiCluster job forwarding model only. The maximum number of times to attempt the pre-execution command of a job from a remote cluster. If the job's pre-execution command fails all attempts, the job is returned to the submission cluster. Specify a value between 0 < MAX_PREEXEC_RETRY < INFINIT_INT. INFINIT_INT is defined in lsf.h. The default value is 5.
REMOTE_MAX_PREEXEC_RETRY=integer: REMOTE_MAX_PREEXEC_RETRY is equivalent to MAX_PREEXEC_RETRY
RES_REQ: When LSF_STRICT_RESREQ=Y is configured in lsf.conf, resource requirement strings in select sections must conform to a more strict syntax. The strict resource requirement syntax only applies to the select section. Strict syntax checking does not apply to the other resource requirement sections (order, rusage, same, or span). However, when LSF_STRICT_RESREQ=Y in lsf.conf, LSF also rejects resource requirement strings where an rusage section contains a non-consumable resource. When LSF_STRICT_RESREQ=N, the default resource requirement selection string evaluation is performed.
RESIZABLE_JOBS=Y|N|auto: configures the resizable jobs feature in the application profile. N disables the resizable job feature in the application profile. Y enables resizable jobs in the application profile and all jobs belonging to the application are resizable by default. auto specifies that all jobs belonging to the application are autoresizable.
RESIZE_NOTIFY_CMD=notification_command: Defines an executable command to be invoked on the first execution host of a job when a resize event occurs. The maximum length of notification command is 4 KB.
MAX_EVENT_STREAM_FILE_NUMBER: Defines the maximum number of lsb.stream.utc files that mbatchd uses before logging an error message to the mbd.log file and stopping the writing of events to the lsb.stream file. The default value is 10.
USE_SUSP_SLOTS=Y | N: If USE_SUSP_SLOTS=Y, pending jobs in the lower priority queue can be dispatched to the slots released by SSUSP jobs. Set USE_SUSP_SLOTS=N to prevent pending jobs in the lower priority queue from being dispatched to the slots released by SSUSP jobs.
ENABLE_EVENT_STREAM: The default has changed from Y to N. By default, this parameter is not defined, which means that event streaming is not enabled (ENABLE_EVENT_STREAM=N).
PRIVILEGED_USER_FORCE_BKILL: A new parameter, when set to Y, allows bkill -r to be used only by root and the LSF administrator. For all other users, the -r option is ignored.
The RESERVE parameter in the ReservationUsage section has been updated to reflect the ability to reserve resources when needed.
RES_SELECT: When LSF_STRICT_RESREQ=Y is configured in lsf.conf, resource requirement strings in select sections must conform to a more strict syntax. The strict resource requirement syntax only applies to the select section.
LSF_STRICT_RESREQ=Y | N: When LSF_STRICT_RESREQ=Y, the resource requirement selection string must conform to the stricter resource requirement syntax described in Administering Platform LSF. The strict resource requirement syntax only applies to the select section. Strict syntax checking does not apply to the other resource requirement sections (order, rusage, same, or span). However, when LSF_STRICT_RESREQ=Y in lsf.conf, LSF also rejects resource requirement strings where an rusage section contains a non-consumable resource. When LSF_STRICT_RESREQ=N, the default resource requirement selection string evaluation is performed.
EGO_ENABLE_AUTO_DAEMON_SHUTDOWN: Lets you shut down daemons automatically if a host fails to join the cluster.
LSF_LOGFILE_OWNER: Lets you specify an owner of daemon log files other than the administrator (the default).
LSB_MIXED_PATH_DELIMITER: Specifies the type of delimiter that separates UNIX and Windows file paths when LSB_MIXED_PATH_ENABLE=y.
LSB_MIXED_PATH_ENABLE: Lets you specify both a UNIX and Windows path for some options of bsub.
LSF_BIND_JOB: The values for LSF_BIND_JOB have changed from YES or NO to BALANCE, ANY, NONE, PACK, USER_CPU_LIST, USER to specify the processor binding policy for sequential and parallel job processes that run on a single host.
When LSF_STRICT_RESREQ=Y in lsf.conf, LSF rejects resource requirement strings where an rusage section contains a non-consumable resource.
An SLA name cannot be the same as a fairshare queue name. This is in addition to the restriction that it cannot be the same as a host partition name or user group name.
BSUB_CHK_RESREQ=any_value: When BSUB_CHK_RESREQ is set, bsub checks the syntax of the resource requirement selection string without actually submitting the job for scheduling and dispatch. Use BSUB_CHK_RESREQ to check the compatibility of your existing resource requirement select strings against the stricter syntax enabled by LSF_STRICT_RESREQ=y in lsf.conf. LSF_STRICT_RESREQ does not need to be set to check the resource requirement selection string syntax. bsub only checks the select section of the resource requirement. Other sections in the resource requirement string are not checked.
LSB_BIND_CPU_LIST: The binding requested at job submission takes effect when LSF_BIND_JOB=USER_CPU_LIST in lsf.conf or BIND_JOB=USER_CPU_LIST in an application profile in lsb.applications. LSF makes sure that the value is in the correct format, but does not check that the value is valid for the execution hosts.
LSB_USER_BIND_JOB: The binding requested at job submission takes effect when LSF_BIND_JOB=USER in lsf.conf or BIND_JOB=USER in an application profile in lsb.applications. This value must be one of Y, BALANCE, PACK, or ANY. Any value other than Y, BALANCE, or PACK is treated as ANY.
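The fallback rule for LSB_USER_BIND_JOB can be written out as a small function. This is a sketch of the rule as stated above, not LSF code: the function name is hypothetical, and the exact, case-sensitive comparison is an assumption.

```python
def effective_user_bind_policy(value):
    """Apply the documented fallback for LSB_USER_BIND_JOB.

    Y, BALANCE, and PACK are kept; anything else is treated as ANY.
    Case-sensitive matching is an assumption of this sketch.
    """
    return value if value in {"Y", "BALANCE", "PACK"} else "ANY"
```

For example, PACK stays PACK, while ANY, an empty value, or a misspelled value all resolve to ANY.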
Releases slots from a running resizable job, and cancels pending job resize allocation requests.
Use bresize release to explicitly release slots from a running job. When releasing slots from an allocation, a minimum of 1 slot on the first execution host must be retained.
Use bresize cancel to cancel a pending allocation request for the specified job ID. The active pending allocation request is generated by LSF automatically for autoresizable jobs. If the job does not have an active pending request, the command fails with an error message.
By default, only cluster administrators, queue administrators, root and the job owner are allowed to run bresize to change job allocations.
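The rule that a release must leave at least 1 slot on the first execution host can be checked before calling bresize release. The sketch below is a hypothetical helper, not part of LSF: it does not model the bresize command syntax, and it assumes the allocation is a non-empty mapping of host name to slot count with the first execution host listed first.

```python
def can_release(allocation, release):
    """Check a proposed slot release against a job's current allocation.

    allocation: dict of host -> slots held, first execution host first
                (assumed non-empty; dicts keep insertion order).
    release:    dict of host -> slots to give back.
    """
    first_host = next(iter(allocation))
    for host, count in release.items():
        held = allocation.get(host, 0)
        if count < 0 or count > held:
            return False  # cannot release more than the job holds
        if host == first_host and held - count < 1:
            return False  # first execution host must keep 1 slot
    return True


alloc = {"hostA": 4, "hostB": 4}  # hostA is the first execution host
```

With this allocation, releasing all 4 slots on hostB is acceptable, releasing 3 slots on hostA is acceptable, and releasing all 4 slots on hostA is not.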
The long output (bapp -l) now includes the processor binding policy values for sequential and parallel job processes (BIND_JOB).
Displays resizable job information:
For JOB_NEW events, bhist displays the auto resizable attribute and resize notification command in the submission line.
For JOB_MODIFY2 events (bmod), bhist displays the auto resizable attribute and resize notification command in the submission line.
bhist displays job resize notification command information for JOB_RESIZE_NOTIFY_START, JOB_RESIZE_NOTIFY_ACCEPT, and JOB_RESIZE_NOTIFY_DONE events.
For JOB_RESIZE_RELEASE events, bhist displays job resize allocation release information.
For JOB_RESIZE_CANCEL events, bhist displays job resize allocation cancel information.
When LSF adds more resources to a running resizable job, bhosts displays the added resources. When LSF removes resources from a running resizable job, bhosts displays the updated resources.
When LSF adds more resources to a running resizable job, bjgroup displays the added resources. When LSF removes resources from a running resizable job, bjgroup -N displays the updated resources.
-WF: Displays an estimated finish time for running or pending jobs. For done or exited jobs, displays the actual finish time (FINISH_TIME).
-WL: Displays the estimated remaining run time of jobs (TIME_LEFT).
-WP: Displays the current estimated completion percentage of jobs (%COMPLETE).
For resizable jobs, bjobs -l displays the autoresizable attribute and the resize notification command.
When LSF adds more resources to a running resizable job, blimits displays the added resources. When LSF removes resources from a running resizable job, blimits displays the updated resources.
bugroup and bmgroup display related user group administrator and host group administrator information.
For resizable jobs, bmod -R "rusage[mem | swp]" only affects the resize allocation request if the job has not been dispatched.
Use the -rnc and -ar options to modify the autoresizable attribute or resize notification command for resizable jobs. You can only modify the autoresizable attribute for pending jobs (PSUSP or PEND). You can only modify the resize notification command for unfinished jobs (not DONE or EXIT jobs).
-We [hour:]minute[/host_name | /host_model]: Sets an estimated run time. Specifying a host or host model normalizes the time with the CPU factor (time/CPU factor) of the host or model.
-We+ [hour:]minute: Sets an estimated run time that is the value you specify added to the accumulated run time. For example, if you specify -We+ 30 and the job has already run for 60 minutes, the new estimated run time is now 90 minutes.
-Wep [value]: Sets an estimated run time that is the percentage of job completion that you specify added to the accumulated run time. For example, if you specify -Wep 25 (meaning that the job is 25% complete) and the job has already run for 60 minutes, the new estimated run time is now 240 minutes.
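The arithmetic behind the two examples above can be reproduced in a few lines. This is an illustrative sketch only: the function names are hypothetical, times are in minutes, and the optional /host_name or /host_model CPU factor normalization is not modeled.

```python
def parse_minutes(text):
    """Convert an '[hour:]minute' string to minutes."""
    hours, _, minutes = text.rpartition(":")
    return int(hours or 0) * 60 + int(minutes)


def estimate_with_added_time(accumulated_min, added_min):
    """Estimated run time when extra time is added to the time already run."""
    return accumulated_min + added_min


def estimate_from_percent(accumulated_min, percent_complete):
    """Estimated run time when the job is percent_complete percent done."""
    if not 0 < percent_complete <= 100:
        raise ValueError("percent_complete must be in (0, 100]")
    return accumulated_min * 100 / percent_complete


added = estimate_with_added_time(60, parse_minutes("30"))   # 90
by_percent = estimate_from_percent(60, 25)                  # 240.0
```

A job that has run for 60 minutes and is 25% complete has an estimated total run time of 60 / 0.25 = 240 minutes, which matches the example in the text.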
When a resizable job has a resize allocation request, bqueues displays pending requests. When LSF adds more resources to a running resizable job, bqueues decreases job PEND counts and displays the added resources. When LSF removes resources from a running resizable job, bqueues displays the updated resources.
bresources can display the resource policy configured in the ReservationUsage section of lsb.resources.
If LSF_STRICT_RESREQ=y in lsf.conf, the selection string on the -R option must conform to the stricter resource requirement string syntax described in Administering Platform LSF. The strict resource requirement syntax only applies to the select section.
If LSF_STRICT_RESREQ=Y in lsf.conf, the selection string on the -R option must conform to the stricter resource requirement string syntax described in Administering Platform LSF. The strict resource requirement syntax only applies to the select section.
-a esub_application: If you have an esub that runs an interactive or X-window job and you have SSH enabled in lsf.conf, the communication between hosts is encrypted.
-IS | -ISp | -ISs | -IX: Submit an interactive job through a secure shell (ssh). Optionally, you can enable ssh in lsf.conf to encrypt the communication for interactive jobs.
-rnc resize_notification_cmd: Specify the full path of an executable to be invoked on the first execution host when the job allocation has been modified (both shrink and grow). -rnc overrides the notification command specified in the application profile (if specified). The maximum length of the notification command is 4 KB.
When a resizable job has a resize allocation request, busers displays pending requests. When LSF adds more resources to a running resizable job, busers decreases job PEND counts and displays the added resources. When LSF removes resources from a running resizable job, busers displays the updated resources.
bswitch can switch resizable jobs between queues regardless of job state. Once the job is switched, the parameters in new queue apply, including threshold configuration, run limit, CPU limit, queue-level resource requirements, etc.
Dual-core CPU license information (LSF_DUALCORE) has been removed as of this update.
The new resizable job feature adds a new JOB_RESIZE event. When there is an allocation change, LSF logs a JOB_RESIZE event after mbatchd receives a JOB_RESIZE_NOTIFY_DONE event.
A new field resizeNotifyCmd is introduced at the end of the JOB_NEW record. New submission options are used in options3: SUB3_AUTO_RESIZE and SUB3_RESIZE_NOTIFY_CMD.
A new field resizeNotifyCmd is introduced at the end of JOB_MODIFY2 record. New submission options are used in options3: SUB3_AUTO_RESIZE and SUB3_RESIZE_NOTIFY_CMD.
A new field jFlags2 is introduced at the end of the JOB_START record.
Logged when a job resize (shrink or grow) request has been sent to the first execution host.
Logged when a job resize request has been accepted from the first execution host of a job.
Logged when LSF receives a resource release request from the client.
Logged when LSF receives a resource allocation cancel request from the client.
A Session Scheduler job suspended with bstop enters USUSP state and the job cannot be killed with bkill. The out-of-box TERMINATE_CONTROL=SIGINT configuration in Session Scheduler causes only SIGINT to be sent to the job from bkill. To be terminated, the job must receive the required SIGCONT, SIGINT, SIGTERM, and SIGKILL signals. You must run bresume to cause the job to receive the correct bkill signals.
When installing License Scheduler standalone, the installer removes EGO environment variables from cshrc.lsf and profile.lsf. Specify a different LSF_TOP from the LSF installation to install standalone License Scheduler.
Access to the Platform FTP site is controlled by login name and password. If you cannot access the distribution files for download, send email to support@platform.com.
You must provide your Customer Support Number and register a user name and password on my.platform.com to download LSF.
To register at my.platform.com, click New User? and complete the registration form. If you do not know your Customer Support Number or cannot log in to my.platform.com, send email to support@platform.com.
Before installing Platform LSF Version 7, you must get a demo license key.
Contact license@platform.com to get a demo license.
Put the demo license file license.dat in the same directory where you downloaded the Platform LSF product distribution tar files.
Use the lsfinstall installation program to install a new LSF Version 7 cluster, or upgrade from an earlier LSF version.
See Installing Platform LSF on UNIX and Linux for new cluster installation steps.
See the Platform LSF Command Reference for detailed information about lsfinstall and its options.
DO NOT use the UNIX and Linux upgrade steps to migrate an existing LSF 7 cluster or LSF 7 Update 1 cluster to LSF 7 Update 4. Follow the manual steps in the document Migrating to Platform LSF Version 7 Update 4 on UNIX and Linux to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 4 on UNIX and Linux.
Platform LSF on Windows 2000, Windows 2003, and Windows XP is distributed in the following packages:
See Installing Platform LSF on Windows for new cluster installation steps.
To migrate your existing LSF Version 7 cluster on Windows to LSF 7 Update 4, you must follow the manual steps in the document Migrating Platform LSF Version 7 to Update 4 on Windows (lsf_migrate_windows_to_update4.pdf).
See Using Platform LSF License Scheduler for installation and configuration steps.
Information about Platform LSF Version 7 is available in the LSF area of the Platform FTP site (ftp.platform.com/distrib/7.0/).
The latest information about all supported releases of Platform LSF is available on the Platform Web site at www.platform.com.
If you have problems accessing the Platform web site or the Platform FTP site, send email to support@platform.com.
my.platform.com—Your one-stop-shop for information, forums, e-support, documentation and release information. my.platform.com provides a single source of information and access to new products and releases from Platform Computing.
On the Platform LSF Family product page of my.platform.com, you can download software, patches, updates and documentation. See what’s new in Platform LSF Version 7, check the system requirements for Platform LSF, or browse and search the latest documentation updates through the Platform LSF Knowledge Center.
The Platform LSF Knowledge Center is your entry point for all LSF documentation. If you have installed the Platform Management Console, access and search the Platform LSF documentation through the link to the Platform Knowledge Center.
Get the latest LSF documentation from my.platform.com. Extract the LSF documentation distribution file to the directory LSF_TOP/docs/lsf.
The Platform EGO Knowledge Center is your entry point for Platform EGO documentation. It is installed when you install LSF. To access and search the EGO documentation, browse the file LSF_TOP/docs/ego/1.2.3/index.html.
If you have installed the Platform Management Console, access the EGO documentation through the link to the Platform Knowledge Center.
Platform’s Professional Services training courses can help you gain the skills necessary to effectively install, configure and manage your Platform products. Courses are available for both new and experienced users and administrators at our corporate headquarters and Platform locations worldwide.
Customized on-site course delivery is also available.
Find out more about Platform Training at www.platform.com/services/training, or contact Training@platform.com for details.
Contact Platform Computing or your LSF vendor for technical support. Use one of the following to contact Platform technical support:
When contacting Platform, please include the full name of your company.
See the Platform Web site at www.platform.com/company/contact-us for other contact information.
To get periodic patch update information, critical bug notification, and general support notification from Platform Support, contact supportnotice-request@platform.com with the subject line containing the word "subscribe".
To get security related issue notification from Platform Support, contact securenotice-request@platform.com with the subject line containing the word "subscribe".
© 1994-2008, Platform Computing Inc.
Although the information in this document has been carefully reviewed, Platform Computing Inc. (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole.
You may only redistribute this document internally within your organization (for example, on an intranet) provided that you continue to check the Platform Web site for updates and update your version of the documentation. You may not make it available to your organization over the Internet.
LSF is a registered trademark of Platform Computing Corporation in the United States and in other jurisdictions.
POWERING HIGH PERFORMANCE, PLATFORM COMPUTING, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, and the PLATFORM and PLATFORM LSF logos are trademarks of Platform Computing Corporation in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Macrovision, Globetrotter, and FLEXlm are registered trademarks or trademarks of Macrovision Corporation in the United States of America and/or other countries.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Intel, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.