For additional information about Platform LSF Version 7 Update 5, visit the Platform Computing Web site:
http://www.platform.com/Products/platform-lsf/features-benefits
DO NOT use the UNIX and Linux upgrade steps to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 5. Follow the manual steps in the document Migrating to Platform LSF Version 7 Update 5 on UNIX and Linux to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 5 on UNIX and Linux.
Visit the Platform Computing Web site for information about supported operating systems and system requirements for Platform LSF:
http://www.platform.com/Products/platform-lsf/technical-information
Applications need to be rebuilt if they use APIs that have changed in LSF Version 7 Update 5.
To take full advantage of new Platform LSF Version 7 features, you should recompile your existing LSF applications with LSF Version 7.
LSF 7 Update 5 added new host management functionality with the introduction of compute units.
Compute units are similar to host groups, with the added feature of granularity, allowing the construction of cluster-wide structures that mimic network architecture. Job scheduling using compute unit resource requirements optimizes job placement based on the underlying system architecture, minimizing communication bottlenecks. Compute units are especially useful when running large parallel jobs. However, using compute units to optimize job placement requires more scheduling time, resulting in a longer time to allocation.
Resource requirement strings can specify compute unit requirements such as running a job exclusively, spreading a job evenly over multiple compute units, setting the number of slots required from each compute unit, and setting the maximum number of compute units used by a job. Compute units then replace hosts as the basic unit of allocation for a job.
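As a sketch of the syntax, compute unit requirements go in the cu[] section of the resource requirement string (the type name "rack", the executable names, and all values are illustrative; a compute unit type must be listed in COMPUTE_UNIT_TYPES in lsb.params):

```shell
# Spread 64 slots evenly across at most 4 compute units of type "rack"
# (illustrative values).
bsub -n 64 -R "cu[type=rack:balance:maxcus=4]" ./parallel_app

# Run exclusively on the allocated compute units: no other jobs may
# share them while this job runs.
bsub -n 32 -R "cu[excl]" ./parallel_app
```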
Individual hosts configured as compute units apply the new compute unit functionality at the host level.
Some limitations apply to the use of compute units:
Compute unit exclusive jobs (cu[excl]) cannot preempt other jobs or be preempted by other jobs.
Compute units were introduced in LSF Version 7 Update 5 and are not compatible with earlier versions of LSF. Affected features:
Hosts from HPC system integrations cannot be allocated to jobs with compute unit requirements. Affected integrations include:
Compute unit requirements cannot be used with compound resource requirement strings.
Advance reservations are not always effective for compute unit exclusive jobs running on compute units split by an advance reservation. If hosts outside the reservation start running a compute unit exclusive job, the hosts inside the advance reservation are also locked. Ideally, all hosts belonging to the same compute unit should be either all inside or all outside an advance reservation.
Compound resource requirements allow you to specify different requirements for some slots within a job, either at the queue-level, application-level, or job-level. bmod -R also accepts compound resource requirement strings for both pending and running jobs.
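As a sketch, a compound resource requirement string joins terms of the form N*{...}, giving each group of slots its own requirements (all values and the executable name are illustrative):

```shell
# One slot on a large-memory host for the master task, and 15 slots
# elsewhere with a smaller memory reservation (illustrative values).
bsub -n 16 -R "1*{select[mem>8000] rusage[mem=8000]} + 15*{rusage[mem=1000]}" ./mpi_app
```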
Special rules take effect when compound resource requirements are merged with resource requirements defined at more than one level. If a compound resource requirement is used at any level (job, application, or queue) the compound multi-level resource requirement merge rules apply.
Some limitations apply to the use of compound resource requirements:
Multiple -R strings and rusage strings containing the or operator (||) are not supported by compound resource requirements.
Resource allocation for parallel jobs using compound resources is done for each compound resource term in the order listed instead of considering all possible combinations. A host rejected for not satisfying one resource requirement term will not be reconsidered for subsequent resource requirement terms.
Windows Terminal Services jobs cannot have compound resource requirements.
Optimized preemption for parallel jobs (using the PREEMPT_FOR parameter in lsb.params) is not supported.
Compound resource requirements were introduced in LSF Version 7 Update 5, and are not compatible with earlier versions of LSF. Affected features:
Hosts from HPC system integrations cannot be allocated to compound resource requirement jobs. Affected integrations include:
The following commands do not support compound resource requirements:
The dynamic priority formula used to determine user priority in fairshare job scheduling has an added fairshare adjustment term and factor, allowing customization of dynamically calculated user shares. The adjustment term can include memory usage by running jobs, as well as the data already used by the dynamic priority formula.
The open source fairshare adjustment code can be altered in the file libfairshareadjust.* and is enabled through setting the parameter FAIRSHARE_ADJUSTMENT_FACTOR in lsb.params to a positive value.
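A minimal sketch of enabling the adjustment in lsb.params (the weighting value is illustrative):

```shell
# lsb.params: weight the value returned by libfairshareadjust.* in the
# dynamic priority formula; unset, 0, or a negative value disables it.
Begin Parameters
FAIRSHARE_ADJUSTMENT_FACTOR = 0.5
End Parameters
```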
The new command bjdepinfo allows you to display all or selected job dependencies. You can get a list of other jobs that a job depends on (parent jobs) or jobs that depend on your job (child jobs).
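A minimal usage sketch (the job ID is illustrative; see the bjdepinfo entry in the Platform LSF Command Reference for the full option list):

```shell
# List the dependencies of job 123, that is, the parent jobs it waits on.
bjdepinfo 123
```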
The new command lsadmin lsflic displays internal LSF license usage. Options include showing all features, specified features, all host class levels in the cluster, and license substitution.
LSF marks new hosts as licensed initially, then confirms license assignments during periodic license management processing. Output from lsadmin lsflic before license assignments are confirmed may show additional licenses in use.
The enhanced PMC now has a Host Dashboard with detailed host information, including options to filter and sort hosts.
The PMC from LSF Version 7 Update 4 can be upgraded to LSF 7 Update 5 alongside the cluster following the steps given in Upgrading Platform LSF on UNIX and Linux (lsf_upgrade_unix.pdf).
Jobs submitted with bsub -G can have the limits of only the specified user group enforced. Enhanced user group limit enforcement is enabled by the parameter ENFORCE_ONE_UG_LIMITS in lsb.params. When not enabled the strictest limits (of the user groups that the user is a member of) are applied to the job.
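A sketch of the configuration and a matching submission (the group names and executable name are illustrative):

```shell
# lsb.params: enforce only the limits of the user group named with bsub -G.
Begin Parameters
ENFORCE_ONE_UG_LIMITS = Y
End Parameters

# Enforce groupA's limits only, even if the user is also a member of groupB.
bsub -G groupA ./myjob
```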
The linux2.6-glibc2.3-x86_64 package now provides cpuset and SGI MPI integrations.
The command blaunch can now be used on Windows 2000 or later hosts to launch parallel jobs, with some limitations:
Job exception events are now logged to lsb.events and lsb.streams for jobs in which the runtime estimate is exceeded. The new exception appears in output from bjobs and bhist.
Enhanced bjobs output now includes a summary of Session Scheduler jobs and tasks, a new job exception indicating when a job’s runtime estimate has been exceeded, and the Share Attribute Account Path (SAAP) for fairshare scheduling.
A configurable parameter in lsf.conf (LSF_LICENSE_MAINTENANCE_INTERVAL) allows you to set longer license checking intervals, saving time during cluster startup and restart. By delaying license maintenance until after startup, mlim communicates with hosts efficiently and new hosts are added quickly.
Both taskman and LSF batch jobs using licenses managed by License Scheduler now have a maximum preemption times setting. Jobs preempted the specified maximum number of times cannot be preempted again.
LSF 7 Update 5 added new functionality that makes it easier to specify a large number of hosts at one time (condensed notation) or to allow a job to run on an intersection of available hosts between a queue, advance reservation, and bsub -m.
Character limits have been increased to 511 characters for user group names in all configuration files and batch commands.
An external static LIM script enables LSF to automatically detect operating system types and versions and display them when running lshosts -l or lshosts -s. You can then specify the reported types in any -R resource requirement string. For example, bsub -R "select[ostype=RHEL4.6]".
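For example, assuming the external static LIM script reports an ostype value for each host (the OS type value and executable name are illustrative):

```shell
# Show the detected ostype static resource for all hosts.
lshosts -s ostype

# Select hosts by the detected OS type (value illustrative).
bsub -R "select[ostype=RHEL4.6]" ./myjob
```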
The following configuration parameters and environment variables are new or changed for LSF Version 7 Update 5:
COMPUTE_UNIT_TYPES: Defines valid compute unit types for use in lsb.hosts and the compute unit resource requirement string (cu[]).
ENABLE_HOST_INTERSECTION: Allows a job to run on an intersection of available hosts between a queue, advance reservation, and bsub -m.
ENFORCE_ONE_UG_LIMITS: When enabled, a job submitted with the -G option specifying a user group has only that user group's limits enforced, even if the user belongs to more than one user group. If not enabled, the strictest limits of all user groups the user is a member of are applied to the job.
FAIRSHARE_ADJUSTMENT_FACTOR: Weighting factor for the fairshare adjustment plugin libfairshareadjust.*. If not defined or set to a value of 0 or less, the fairshare adjustment has no impact on the dynamic priority formula used to calculate user priority for fairshare job scheduling.
LOG_RUNTIME_EST_EXCEEDED: Undocumented parameter enabling logging of the new job exception runtime_est_exceeded. Default value is Y. Not displayed in the bparams output.
MAX_JOB_PREEMPT: Now applies to LSF batch jobs using licenses managed by License Scheduler when enabled by LS_ENABLE_MAX_PREEMPT in lsf.licensescheduler.
EGO_ESLIM_TIMEOUT: Controls how long the LIM waits for any external static LIM scripts to run.
LSB_LOGON_INTERACTIVE: LSF parameter automatically set to Y on Windows Vista platforms, allowing the correct users to submit jobs from Windows Vista hosts. This parameter is not documented.
LSF_ASPLUGIN: Specifies a path to the SGI Array Services library libarray.so. The parameter only takes effect on 64-bit x86 Linux 2.6, glibc 2.3. The default path is /usr/lib64/libarray.so.
LSF_BMPLUGIN: Specifies a path to the bitmask library libbitmask.so. The parameter only takes effect on 64-bit x86 Linux 2.6, glibc 2.3. The default path is /usr/lib64/libbitmask.so.
LSF_CPUSETLIB: Specifies a path to the SGI cpuset library libcpuset.so. The parameter only takes effect on 64-bit x86 Linux 2.6, glibc 2.3. The default path is /usr/lib64/libcpuset.so.
LSF_LICENSE_MAINTENANCE_INTERVAL: Allows you to control how often LSF checks for licenses on cluster startup or restart. By setting a value higher than the default of 5 seconds, you can significantly increase the speed at which the cluster starts up.
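A sketch of the setting in lsf.conf (the interval value is illustrative):

```shell
# lsf.conf: check licenses every 60 seconds instead of every 5 seconds.
LSF_LICENSE_MAINTENANCE_INTERVAL=60
```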
LSF_MONITOR_LICENSE_TOOL: Enables data collection by lim for the command option lsadmin lsflic.
LSF_VPLUGIN: On SGI Linux (64-bit x86 Linux 2.6, glibc 2.3), an example path:
LS_ENABLE_MAX_PREEMPT: Enables checking preemption times for taskman jobs based on the value of the parameter LS_MAX_TASKMAN_PREEMPT in lsf.licensescheduler and MAX_JOB_PREEMPT in lsb.queues, lsb.applications, or lsb.params.
LS_MAX_TASKMAN_PREEMPT: Defines the maximum number of times taskman jobs can be preempted.
The option -l output now displays the new job exception runtime_est_exceeded under the heading EXCEPTION STATUS, when applicable.
When using fairshare scheduling the option -r now displays the fairshare adjustment plugin contribution to user dynamic priority under the heading ADJUST.
A new option (-ss) lists summary information on Session Scheduler jobs and tasks.
The -l option has been expanded to show when a job’s runtime estimate has been exceeded.
The -l option also now displays the Share Attribute Account Path (SAAP) for fairshare scheduling.
Now displays compound resource requirements, when applicable.
The command blaunch can now be used on Windows hosts to launch parallel jobs, although it has some limitations.
The LSF 6.x passwd.lsfuser password file is not compatible with LSF 7. In LSF 6.x, if a domain name is defined with LSF_USER_DOMAIN in lsf.conf, LSF only saves the user name to the password entry in the passwd.lsfuser password file. In LSF 7, the user name part of the password entry in the passwd.lsfuser file is a fully qualified user name (domain_name\user_name), even if LSF_USER_DOMAIN is defined in lsf.conf.
Workaround: If your cluster defines LSF_USER_DOMAIN in lsf.conf, you must upgrade the entire 6.x cluster to LSF 7, and have all users run lspasswd to reenter their password.
Without this workaround, LSF 7 daemons cannot find the 6.x password entry and 6.x daemons cannot see the password saved on LSF 7 servers.
This problem affects all LSF versions before Version 7: LSF 6.0, 6.1, and 6.2.
If you want to use LSF Version 7 Update 5 on SUSE 11 with x86-64 processors, contact Platform Support for a patch.
Backfill jobs can overlap exclusive compute unit reservations. Free slots within an exclusive compute unit reservation appear available when using bslots to schedule backfill jobs. Job slots used by the exclusive compute unit job do not appear available beyond the reservation start time.
When specifying a domain name in any LSF configuration file, use all uppercase characters. For example: LSF/lsfadmin instead of lsf/lsfadmin. Configuration settings will not be applied if the domain is in lowercase characters.
Jobs submitted with CPUSET_TYPE=none are still considered CPUSET jobs, and do not support compound resource requirements. For example, the following job submission will not run:
When using ProPacks in a cluster with mixed host types, you must also specify "same[type]" in the resource requirement string or use %a to run applications on appropriate host types. Only setting the ProPack version number is not sufficient to identify the possible host types a job can run on.
If there are no PSET hosts in your cluster, the PSET plug-in is not supported and should not be configured in lsb.modules.
When installing a cluster with ENABLE_HPC_CONFIG=Y, if you restart the sbatchd on a Linux 2.6-glibc2.3-x86_64 host without a CPUSET package, the following error message is logged: Cannot find CPUSET library in LSF_ASPLUGIN=/usr/lib64/libarray.so, using the default value /usr/lib64/libarray.so. This message means that you do not have a CPUSET package installed on that host.
When compiling an application with a Version 7 Update 5 library, specify the option -ldl.
If you enable ENFORCE_ONE_UG_LIMITS and you have a user group with the keyword all, the limits are enforced on all user groups, not just the one specified. A patch will be available soon. Contact Platform Support.
A Session Scheduler job suspended with bstop enters USUSP state and cannot be killed with bkill. The out-of-box TERMINATE_CONTROL=SIGINT configuration in Session Scheduler causes bkill to send only SIGINT to the job. To be terminated, the job must receive the required SIGCONT, SIGINT, SIGTERM, and SIGKILL signals. Run bresume first so that the job receives the correct bkill signals.
When installing License Scheduler standalone, the installer removes EGO environment variables from cshrc.lsf and profile.lsf. Specify a different LSF_TOP from the LSF installation to install standalone License Scheduler.
In the resource plan, if you specify reclamation with a grace period, the grace period is ignored by LSF. All resources are reclaimed immediately.
The LSF administrator cannot start the PMC in EGO-decoupled mode. Because the PMC was originally started by root, the log files are owned by root. When the PMC is restarted by the LSF cluster administrator, the administrator does not own the existing log files, causing the Java (Tomcat) process to stall.
Integrating LDAP with LSF has some additional requirements:
If you did not set DERBY_DB_HOST in install.config, you can still enable the Derby database host after installation. See procedure that follows.
Access to the Platform FTP site is controlled by login name and password. If you cannot access the distribution files for download, send email to support@platform.com.
You must provide your Customer Support Number and register a user name and password on my.platform.com to download LSF.
To register at my.platform.com, click New User? and complete the registration form. If you do not know your Customer Support Number or cannot log in to my.platform.com, send email to support@platform.com.
Before installing Platform LSF Version 7, you must get a demo license key.
Contact license@platform.com to get a demo license.
Put the demo license file license.dat in the same directory where you downloaded the Platform LSF product distribution tar files.
Use the lsfinstall installation program to install a new LSF Version 7 cluster, or upgrade from an earlier LSF version.
See Installing Platform LSF on UNIX and Linux for new cluster installation steps.
See the Platform LSF Command Reference for detailed information about lsfinstall and its options.
DO NOT use the UNIX and Linux upgrade steps to migrate an existing LSF 7 cluster or LSF 7 Update 1 cluster to LSF 7 Update 5. Follow the manual steps in the document Migrating to Platform LSF Version 7 Update 5 on UNIX and Linux to migrate an existing LSF 7 Update 1 cluster to LSF 7 Update 5 on UNIX and Linux.
Platform LSF on Windows 2000, Windows 2003, and Windows XP is distributed in the following packages:
See Installing Platform LSF on Windows for new cluster installation steps.
To migrate your existing LSF Version 7 cluster on Windows to LSF 7 Update 5, you must follow the manual steps in the document Migrating Platform LSF Version 7 to Update 5 on Windows (lsf_migrate_windows_to_update5.pdf).
See Using Platform LSF License Scheduler for installation and configuration steps.
Information about Platform LSF Version 7 is available in the LSF area of the Platform FTP site (ftp.platform.com/distrib/7.0/).
The latest information about all supported releases of Platform LSF is available on the Platform Web site at www.platform.com.
If you have problems accessing the Platform web site or the Platform FTP site, send email to support@platform.com.
my.platform.com—Your one-stop-shop for information, forums, e-support, documentation and release information. my.platform.com provides a single source of information and access to new products and releases from Platform Computing.
On the Platform LSF Family product page of my.platform.com, you can download software, patches, updates and documentation. See what’s new in Platform LSF Version 7, check the system requirements for Platform LSF, or browse and search the latest documentation updates through the Platform LSF Knowledge Center.
The Platform LSF Knowledge Center is your entry point for all LSF documentation. If you have installed the Platform Management Console, access and search the Platform LSF documentation through the link to the Platform Knowledge Center.
Get the latest LSF documentation from my.platform.com. Extract the LSF documentation distribution file to the directory LSF_TOP/docs/lsf.
The Platform EGO Knowledge Center is your entry point for Platform EGO documentation. It is installed when you install LSF. To access and search the EGO documentation, browse the file LSF_TOP/docs/ego/1.2.3/index.html.
If you have installed the Platform Management Console, access the EGO documentation through the link to the Platform Knowledge Center.
Platform’s Professional Services training courses can help you gain the skills necessary to effectively install, configure and manage your Platform products. Courses are available for both new and experienced users and administrators at our corporate headquarters and Platform locations worldwide.
Customized on-site course delivery is also available.
Find out more about Platform Training at www.platform.com/services/training, or contact Training@platform.com for details.
Contact Platform Computing or your LSF vendor for technical support. Use one of the following to contact Platform technical support:
When contacting Platform, please include the full name of your company.
See the Platform Web site at www.platform.com/company/contact-us for other contact information.
To get periodic patch update information, critical bug notification, and general support notification from Platform Support, contact supportnotice-request@platform.com with the subject line containing the word "subscribe".
To get security related issue notification from Platform Support, contact securenotice-request@platform.com with the subject line containing the word "subscribe".
© 1994-2009, Platform Computing Inc.
Although the information in this document has been carefully reviewed, Platform Computing Inc. (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole.
You may only redistribute this document internally within your organization (for example, on an intranet) provided that you continue to check the Platform Web site for updates and update your version of the documentation. You may not make it available to your organization over the Internet.
LSF is a registered trademark of Platform Computing Corporation in the United States and in other jurisdictions.
POWERING HIGH PERFORMANCE, PLATFORM COMPUTING, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, and the PLATFORM and PLATFORM LSF logos are trademarks of Platform Computing Corporation in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Macrovision, Globetrotter, and FLEXlm are registered trademarks or trademarks of Macrovision Corporation in the United States of America and/or other countries.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Intel, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.