Release Date: October 2008
The following bugs have been fixed in the October 2008 update (LSF 7 Update 4) since the May 2008 update (LSF 7 Update 3):
112864
Date: 2008-09-24
Description: LIM cannot detect the correct number of CPUs and cores on a non-virtual machine if /proc/xen exists
Component: lim
Platform: linux
Impact: LIM reports the wrong number of CPUs and cores.
109205
Date: 2008-09-23
Description: eexec is not run after an interactive job finishes
Component: sbatchd
Platform: All
Impact: eexec does not run
107031
Date: 2008-09-23
Description: SLA does not work with fairshare
Component: schmod_fairshare.so, mbschd
Platform: All
Impact: SLA goal is not reached
109779
Date: 2008-09-16
Description: lsload / lsplace -R "status==ok" does not show the correct RES status when RES is down.
Component: lim
Platform: All
Impact: lsload / lsplace does not show RES status properly. Users cannot pick hosts with RES daemons running.
109128
Date: 2008-09-16
Description: No email sent for job idle exceptions
Component: mbatchd
Platform: Windows
Impact: LSF administrator is not notified about idle jobs
112643
Date: 2008-09-11
Description: Jobs are pending with reason "New job is waiting for scheduling"
Component: mbatchd
Platform: All
Impact: Jobs pend forever until the system clock is reset. On a system where ntpd or another mechanism rolls back the system clock, some jobs stay in pending status forever with reason "New job is waiting for scheduling;". The problem lies in synchronization between mbatchd and mbschd. As a workaround, run badmin reconfig.
109183
Date: 2008-09-01
Description: PAM and TS have exited, but LSF still reports the job as RUN
Component: pam
Platform: linux, unix
Impact: Resources are occupied by the unfinished jobs
108900
Date: 2008-08-29
Description: After an upgrade and mbatchd restart, batch commands do not respond for a long time
Component: mbatchd
Platform: All
Impact: Admin is forced to remove the lsb.events file and restart mbatchd again to get the batch system working
106777
Date: 2008-08-28
Description: When a job is submitted with a project name that contains spaces, bacct -P cannot recognize the job
Component: bacct
Platform: All
Impact: Cannot use bacct to query a project name that contains spaces.
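A project name that contains spaces has to reach bsub and bacct as a single argument. The following Python sketch shows one way to build such command lines with shlex.quote. The project name "my project" and the job command "sleep 10" are hypothetical, and the sketch only constructs the command strings; it does not run LSF.

```python
import shlex


def project_commands(project, job_command):
    """Return (bsub, bacct) command lines that pass `project` as one argument."""
    # shlex.quote adds shell quoting only when the value needs it.
    quoted = shlex.quote(project)
    bsub_cmd = f"bsub -P {quoted} {job_command}"
    bacct_cmd = f"bacct -P {quoted}"
    return bsub_cmd, bacct_cmd


if __name__ == "__main__":
    # Hypothetical project name and job command, for illustration only.
    for cmd in project_commands("my project", "sleep 10"):
        print(cmd)
```

Splitting either string back with shlex.split yields the project name as one element, which is what the shell would hand to the command.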
105490
Date: 2008-08-28
Description: When a customer defines a "default" limit, the value applies to both users and user groups. The customer wants a method to apply the limit only to users, not to user groups.
Component: mbschd, mbatchd
Platform: All
Impact: Administrators have to configure user group limits one by one when they want to apply the "default" limit only to users, not to user groups.
113378
Date: 2008-08-27
Description: When RESOURCE_RESERVE_PER_SLOT is defined in lsb.params, the value of a host-specific resource reported by elim is incorrect once a parallel job that has reserved this resource has started.
Component: mbatchd
Platform: All
Impact: Resource reservation value is wrong
112514
Date: 2008-08-27
Description: bhist prints the same job event status twice
Component: bhist
Platform: All
Impact: Misleading bhist output
112256
Date: 2008-08-27
Description: High priority job does not preempt jobs with fewer slots
Component: schmod_preemption.so
Platform: All
Impact: OPTIMAL_MINI_JOB preemption policy does not work
113844
Date: 2008-08-24
Description: log_jobdata(): Job failed in getHostFactor() message appears in execution cluster mbatchd log file, even though jobs can run successfully.
Component: mbatchd
Platform: All
Impact: Misleading mbatchd log message
113299
Date: 2008-08-21
Description: To modify the user-specified number of requested processors through an esub, both LSB_SUB_NUM_PROCESSORS and LSB_SUB_MAX_NUM_PROCESSORS need to be specified, and they need to be written to LSB_SUB_MODIFY_FILE in a certain order.
Component: bsub
Platform: All
Impact: This behavior can prevent esub from making a modification that it should.
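As an illustration of the esub mechanism this entry refers to, the following Python sketch writes both processor variables as name=value lines to the file named by LSB_SUB_MODIFY_FILE. The function name, the optional modify_file parameter, and the values 4 and 8 are hypothetical. The order of the two lines is arbitrary here, because the entry does not say which order was required.

```python
import os


def write_processor_request(min_procs, max_procs, modify_file=None):
    """Append the minimum and maximum processor counts to the esub modify file.

    `modify_file` is a hypothetical override for testing; an esub would
    normally take the path from the LSB_SUB_MODIFY_FILE environment variable.
    """
    path = modify_file or os.environ["LSB_SUB_MODIFY_FILE"]
    with open(path, "a") as f:
        # The order of these two lines is an assumption, not a documented rule.
        f.write(f"LSB_SUB_NUM_PROCESSORS={min_procs}\n")
        f.write(f"LSB_SUB_MAX_NUM_PROCESSORS={max_procs}\n")


if __name__ == "__main__":
    # Only act when run as an esub, that is, when the variable is set.
    if "LSB_SUB_MODIFY_FILE" in os.environ:
        write_processor_request(4, 8)
```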
111507
Date: 2008-08-21
Description: NFS (root=nobody) MC-Lease mode purging lsb.lease.state file fails, lsb.lease.state.tmp left behind
Component: mbatchd
Platform: All
Impact: NFS (root=nobody) MC-Lease purging lsb.lease.state file fails, lsb.lease.state.tmp left behind
112296
Date: 2008-08-20
Description: When mbatchd restarts with bad events, lsb.events.0 is missing
Component: mbatchd
Platform: All
Impact: History information for some jobs is lost.
112285
Date: 2008-08-19
Description: If duplicate event logging is configured, when LSF_SHAREDIR goes down and comes back up, the child mbatchd, which writes to LSF_SHAREDIR, does not respond for 15 minutes and then dies.
Component: mbatchd
Platform: All
Impact: Data duplication is delayed for 15 minutes. The parent mbatchd dies, and a new one starts which adds further to the delay.
113707
Date: 2008-08-15
Description: An MPI job gets different results when run through LSF and outside LSF.
Component: intelmpi_wrapper, mpich2_wrapper
Platform: intelmpi_wrapper (all Linux except Cray), mpich2_wrapper (all except SLURM)
Impact: An MPI local option is set as a global option. The MPI program may run with the wrong arguments.
113264
Date: 2008-08-14
Description: Mandatory first execution host does not work at queue level with "RES_REQ=span
Component: schmod_parallel.so
Platform: All
Impact: Mandatory first execution host does not work.
108727
Date: 2008-08-14
Description: bjobs -G does not work as documented; it behaves like bjobs -u
Component: mbatchd
Platform: All
Impact: bjobs -G does not work as documented
110814
Date: 2008-08-13
Description: SGI-MPI (vendor-MPI) mpirun options are not recognized by pam
Component: pam
Platform: All
Impact: Cannot use SGI-MPI mpirun options on pam command line when using pam SGI-MPI integration
111728
Date: 2008-08-07
Description: First execution node is not the same after migrating a parallel job in MultiCluster environment.
Component: mbatchd
Platform: All
Impact: All jobs fail after migration
112053
Date: 2008-08-06
Description: LDAP authentication for PMC not supported. Users cannot log in to PMC with local Linux account.
Component: PMC
Platform: All
Impact: Customers who use LDAP instead of NIS cannot log in to PMC with local Linux account.
97887
Date: 2008-08-04
Description: Jobs with a higher LSF version are lost if the master and master candidate do not have the same version
Component: mbatchd
Platform: All
Impact: Jobs are lost.
111673
Date: 2008-08-04
Description: bladmin reconfig fails on x86_64 platform
Component: bladmin, blhosts
Platform: Unix
Impact: Cannot use bladmin reconfig to restart License Scheduler
110694
Date: 2008-07-29
Description: Cross-queue fairshare scheduling does not work when two slave queues belonging to two cross-queue sets have the same priority
Component: mbschd
Platform: All
Impact: Cross-queue fairshare scheduling does not work
106508
Date: 2008-07-29
Description: Users get incorrect emails about license overuse and the number of available licenses
Component: lim
Platform: All
Impact: Potential overuse of license resources
111150
Date: 2008-07-27
Description: Job pends with reason "Job's resource requirements not satisfied"
Component: schmod_mc.so
Platform: All
Impact: Cannot submit job from with resource requirement specified
111717
Date: 2008-07-25
Description: badmin reconfig dispatches more jobs than available licenses
Component: mbatchd
Platform: All
Impact: badmin reconfig dispatches more jobs than available licenses
111994
Date: 2008-07-24
Description: When multiple clusters exist on a group of hosts, users can switch their environment from one cluster to another and back again. profile.lsf and/or cshrc.lsf do not set the environment correctly: the PATH environment variable is set to LSF_BINDIR of the wrong cluster.
Component: install
Platform: All
Impact: profile.lsf and cshrc.lsf set the wrong LSF environment.
111672
Date: 2008-07-24
Description: Host does not have a software license
Component: lim
Platform: All
Impact: Cluster cannot work without restarting the LIM
111838
Date: 2008-07-23
Description: Deadline constraint policy violated under specific run-window configurations
Component: mbatchd
Platform: All
Impact: Deadline constraint policy does not work. Jobs run when they should not.
110981
Date: 2008-07-23
Description: Add submission and execution cluster name in email notification
Component: sbatchd
Platform: All
Impact: Email notification incomplete
109587
Date: 2008-07-20
Description: Job pends forever after mbatchd restart with 'Dependency not satisfied'
Component: mbatchd
Platform: All
Impact: Job pends forever after mbatchd restart with 'Dependency not satisfied'
111569
Date: 2008-07-17
Description: blimits -w cannot show full length of SLOT MEM TMP SWAP
Component: blimits
Platform: All
Impact: blimits -w cannot show full length of SLOT MEM TMP SWAP
106541
Date: 2008-07-17
Description: Cannot query jobs using bhist by specifying user group
Component: bhist
Platform: All
Impact: bhist output incomplete
110799
Date: 2008-07-15
Description: bsub ignores resource requirement string in the SchedulerParams section of jsdl - xml specification
Component: bsub
Platform: All
Impact: bsub ignores resource requirement string in the SchedulerParams section of jsdl - xml specification
110964
Date: 2008-07-13
Description: Need improved LSF environment setup
Component: sbatchd
Platform: All
Impact: LSB_EXEC_CLUSTER and LSB_SUB_CLUSTER environment variables should be available for job preparation. However, these variables are not available if the job executes on the submission cluster. LSB_EXEC_CLUSTER variable should be available for all jobs.
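A job preparation script might read these two variables as in the following Python sketch. The function name and the "remote" flag are hypothetical. Treating a job as remote when the two cluster names differ is an assumption made for illustration, not documented LSF behavior.

```python
import os


def cluster_info(environ=None):
    """Report the submission and execution cluster names seen by a job."""
    env = os.environ if environ is None else environ
    sub = env.get("LSB_SUB_CLUSTER")    # may be unset, as this entry describes
    exe = env.get("LSB_EXEC_CLUSTER")
    # Assumption: the job ran remotely when both names are set and differ.
    remote = sub is not None and exe is not None and sub != exe
    return {"submission": sub, "execution": exe, "remote": remote}


if __name__ == "__main__":
    print(cluster_info())
```

Using get rather than indexing keeps the script from failing with a KeyError on a system where the variables are not set.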
89242
Date: 2008-07-10
Description: Duplicate emails are received for the same host exceptions
Component: mbatchd
Platform: All
Impact: Duplicate emails are received for the same host exceptions
111051
Date: 2008-07-09
Description: lsb.stream file permission is set to 600 after running bmod
Component: utopia/lsbatch/lib/liblsbstream.so
Platform: linux2.6-glibc2.3-x86_64
Impact: Users cannot read lsb.stream file
110938
Date: 2008-07-08
Description: PERF stops writing to the LSB_EVENTS table if charged SAAP field is more than 64 characters
Component: PERF
Platform: All
Impact: Event data loading stopped
110227
Date: 2008-07-06
Description: lsmake gets an error message if LSF daemon ports are defined in /etc/services instead of lsf.conf
Component: res
Platform: All
Impact: Cannot use the right res port.
109659
Date: 2008-07-06
Description: Error messages about long project names are inconsistent
Component: bacct, blimits, bmod, bhist, bjobs, mbatchd, bsub
Platform: All
Impact: Confusing error messages
110576
Date: 2008-07-04
Description: LD_LIBRARY_PATH gets reappended if user submits a job with job command environment
Component: sbatchd
Platform: All
Impact: User scripts fail
110322
Date: 2008-06-30
Description: Incorrect error messages are logged in mbatchd log
Component: bld
Platform: unix
Impact: Incorrect messages are visible in logs
109930
Date: 2008-06-30
Description: Job gets dispatched to the wrong host
Component: mbschd
Platform: All
Impact: Jobs fail because of incorrect execution host
108459
Date: 2008-06-27
Description: CONTROL ACTION is invoked even though SLA has been met
Component: mbatchd
Platform: All
Impact: Incorrect job control action is invoked
107678
Date: 2008-06-26
Description: eexec runs as an unexpected user group if LSF_EEXEC_USER is set
Component: sbatchd
Platform: All
Impact: eexec cannot run
106413
Date: 2008-06-24
Description: Customer is using Solutions#89820 (Enhance bjobs/LSF batch API to just fetch summary information of jobs). When using this fix, bjobs fails with "xdr encode/decode error" if hundreds of job IDs are specified at the same time.
Component: bjobs, mbatchd
Platform: All
Impact: User scripts that use a lot of job IDs in their bjobs query fail.
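On a system without this fix, a script that queries many job IDs could split them across several bjobs invocations. The following Python sketch builds the argument lists only and does not run bjobs. The batch size of 50 is a hypothetical value, since the entry does not state how many IDs trigger the error.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bjobs_commands(job_ids, size=50):
    """Build one bjobs argument list per batch of job IDs.

    The default batch size is an assumption chosen for illustration.
    """
    return [["bjobs"] + [str(j) for j in chunk]
            for chunk in chunked(list(job_ids), size)]


if __name__ == "__main__":
    # 120 hypothetical job IDs become three invocations.
    for argv in bjobs_commands(range(1, 121)):
        print(len(argv) - 1, "job IDs in this invocation")
```

Each returned list can be passed to subprocess.run as-is, which avoids shell quoting altogether.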
110461
Date: 2008-06-23
Description: badmin hclose/hopen and lsadmin reslogon/reslogoff fail on slave host
Component: lsadmin, badmin
Platform: All
Impact: badmin hclose/hopen and lsadmin reslogon/reslogoff fail on slave host
110026
Date: 2008-06-19
Description: MBD never switches out LOG_SWITCH event
Component: mbatchd
Platform: All
Impact: MBD never switches out LOG_SWITCH event
87626
Date: 2008-06-13
Description: Totalview integration requires sleep() to debug application
Component:
Platform: All
Impact: Totalview integration should work around this problem.
107903
Date: 2008-06-10
Description: If cpuset destroy API fails after rla restart, an error of status file corruption is reported in rla log
Component: rla
Platform: linux2.4-glibc2.2-sn-ipf, linux2.4-glibc2.3-sn-ipf, linux2.6-glibc2.3-sn-ipf, linux2.6-glibc2.4-sn-ipf
Impact: rla cannot clean up leftover cpusets
108199
Date: 2008-05-27
Description: Job fails even though its tasks have finished successfully. pam waits for nonexistent tasks, then kills the job.
Component: pam
Platform: All
Impact: Job fails
106832
Date: 2008-05-16
Description: New jobs are not queued at the bottom
Component: mbatchd
Platform: All
Impact: Dispatch order is not correct
105599
Date: 2008-05-15
Description: When LSF_EAUTH_KEY is configured in /etc/lsf.sudoers, jobs submitted through Windows clients fail with the error “C:\Documents and Settings\user1 >bsub -R "type==any" dir User permission denied. Job not submitted.”
Component:
Platform: All
Impact: Cannot submit job through Windows client with LSF_EAUTH_KEY set in /etc/lsf.sudoers.
support@platform.com
www.platform.com
North America: +1 905 948 4297
Europe: +44 1256 370 530
Asia: +86 10 6238 1125
Toll-free: 1-877-444-4573
Platform Support
Platform Computing Corporation
3760 14th Avenue
Markham, Ontario
Canada L3R 3T7
© 1994 - 2008 Platform Computing Corporation
All Rights Reserved.
Although the information in this document has been carefully reviewed, Platform Computing Corporation (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
Document redistribution policy: This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole. You may only redistribute this document internally within your organization (for example, on an intranet).
LSF is a registered trademark of Platform Computing Corporation in the United States and in other jurisdictions.
ACCELERATING INTELLIGENCE, THE BOTTOM LINE IN DISTRIBUTED COMPUTING, PLATFORM COMPUTING, CLUSTERWARE, PLATFORM ACTIVECLUSTER, IT INTELLIGENCE, SITEASSURE, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, PLATFORM INTELLIGENCE, PLATFORM INFRASTRUCTURE INSIGHT, PLATFORM WORKLOAD INSIGHT, and the PLATFORM and LSF logos are trademarks of Platform Computing Corporation in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.