Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_UPDATE_SUMMARY_INTERVAL = 1
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_UPDATE_SUMMARY_BY_TASK = 0
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_REQUEUE_LIMIT = 1
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_RETRY_LIMIT = 1
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_MAX_TASKS = 10
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_MAX_RUNLIMIT = 600
Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_ACCT_DIR = /home/user1/ssched
Nov 20 14:35:03 2008 4546 6 7.05 Task <1[1]> submitted. Command <sleep 0>;
Nov 20 14:35:03 2008 4546 6 7.05 Task <1[2]> submitted. Command <sleep 0>;
Nov 20 14:35:03 2008 4546 6 7.05 Task <1[3]> submitted. Command <sleep 0>;
Nov 20 14:35:03 2008 4546 6 7.05 Task <1[4]> submitted. Command <sleep 0>;
Nov 20 14:35:03 2008 4546 6 7.05 Task <1[5]> submitted. Command <sleep 0>;
Nov 20 14:35:05 2008 4546 6 7.05 Task <1[1]> done successfully. The CPU time used is 0.030993 seconds;
Nov 20 14:35:05 2008 4546 6 7.05 Task <1[2]> done successfully. The CPU time used is 0.039992 seconds;
Nov 20 14:35:05 2008 4546 6 7.05 Task <1[3]> done successfully. The CPU time used is 0.033993 seconds;
Nov 20 14:35:05 2008 4546 6 7.05 Task <1[4]> done successfully. The CPU time used is 0.026994 seconds;
Nov 20 14:35:05 2008 4546 6 7.05 Task <1[5]> done successfully. The CPU time used is 0.036992 seconds;
Task Summary
Submitted: 5
Done: 5
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_UPDATE_SUMMARY_INTERVAL = 1
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_UPDATE_SUMMARY_BY_TASK = 0
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_REQUEUE_LIMIT = 1
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_RETRY_LIMIT = 1
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_MAX_TASKS = 10
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_MAX_RUNLIMIT = 600
Nov 20 14:35:47 2008 4748 6 7.05 SSCHED_ACCT_DIR = /home/user1/ssched
Nov 20 14:35:49 2008 4748 6 7.05 Task <1> submitted. Command <exit 1>;
Nov 20 14:35:50 2008 4748 6 7.05 Task <1> exited with status 1.
Nov 20 14:35:50 2008 4748 6 7.05 Task <1> submitted. Command <exit 1>;
Nov 20 14:35:50 2008 4748 6 7.05 Task <1> exited with status 1.
Task Summary
Submitted: 1
Requeued: 1
Done: 0
Exited: 2
Execution Errors: 2
Dispatch Errors: 0
Other Errors: 0
Task Error Summary
Execution Error
Task ID: 1
Submit Time: Tue Nov 20 14:35:49 2008
Start Time: Tue Nov 20 14:35:50 2008
End Time: Tue Nov 20 14:35:50 2008
Exit Code: 1
Exit Reason: Normal exit
Exec Hosts: ibm03
Exec Home: /home/user1
Exec Dir: /home/user1/src/lsf7ss/ssched/ssched
Command: exit 1
Action: Requeue exit value match; task will be requeued
Execution Error
Task ID: 1
Submit Time: Tue Nov 20 14:35:50 2008
Start Time: Tue Nov 20 14:35:50 2008
End Time: Tue Nov 20 14:35:50 2008
Exit Code: 1
Exit Reason: Normal exit
Exec Hosts: ibm03
Exec Home: /home/user1
Exec Dir: /home/user1/src/lsf7ss/ssched/ssched
Command: exit 1
Action: Task requeue limit reached; task will not be requeued
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_UPDATE_SUMMARY_INTERVAL = 1
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_UPDATE_SUMMARY_BY_TASK = 0
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_REQUEUE_LIMIT = 1
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_RETRY_LIMIT = 1
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_MAX_TASKS = 10
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_MAX_RUNLIMIT = 600
Nov 20 14:37:04 2008 5049 6 7.05 SSCHED_ACCT_DIR = /home/user1/ssched
Nov 20 14:37:06 2008 5049 6 7.05 Task <1> submitted. Command <sleep 0>;
Nov 20 14:37:08 2008 5049 6 7.05 Task <1> had a dispatch error.
Nov 20 14:37:08 2008 5049 6 7.05 Task <1> submitted. Command <sleep 0>;
Nov 20 14:37:08 2008 5049 6 7.05 Task <1> had a dispatch error.
Task Summary
Submitted: 1
Done: 0
Exited: 1
Execution Errors: 0
Dispatch Errors: 1
Other Errors: 0
Task Error Summary
Dispatch Error
Task ID: 1
Submit Time: Tue Nov 20 14:37:06 2008
Failure Reason: Pre-execution command failed
Command: sleep 0
Pre-Exec: exit 1
Start time: Tue Nov 20 14:37:07 2008
Execution host: ibm03
Action: Task will be retried
Dispatch Error
Task ID: 1
Submit Time: Tue Nov 20 14:37:08 2008
Failure Reason: Pre-execution command failed
Command: sleep 0
Pre-Exec: exit 1
Start time: Tue Nov 20 14:37:08 2008
Execution host: ibm03
Action: Task retry limit reached; task will not be retried
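The log excerpts above follow a regular line layout, so task events can be tallied with a short script. The following is a minimal sketch, not part of Session Scheduler: the regular expression is inferred only from the sample lines shown here, the field after the timestamp is assumed to identify one ssched run, and the names `LINE_RE`, `classify`, and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Layout inferred from the sample lines above (an assumption, not a documented format):
# "<Mon DD HH:MM:SS YYYY> <run id> <number> <version> Task <ID> <event text>"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<run>\d+) \d+ [\d.]+ "
    r"Task <(?P<task>[^>]+)> (?P<event>.+)$"
)


def classify(event):
    """Map the free-text part of a task line to a short event name."""
    if event.startswith("submitted."):
        return "submitted"
    if event.startswith("done successfully."):
        return "done"
    if event.startswith("exited with status"):
        return "exited"
    if event.startswith("had a dispatch error"):
        return "dispatch_error"
    return "other"


def summarize(lines):
    """Count task events per run; lines that are not task events are skipped."""
    counts = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # for example, the SSCHED_* parameter lines
        counts.setdefault(match["run"], Counter())[classify(match["event"])] += 1
    return counts


sample = [
    "Nov 20 14:35:01 2008 4546 6 7.05 SSCHED_MAX_TASKS = 10",
    "Nov 20 14:35:03 2008 4546 6 7.05 Task <1[1]> submitted. Command <sleep 0>;",
    "Nov 20 14:35:05 2008 4546 6 7.05 Task <1[1]> done successfully. The CPU time used is 0.030993 seconds;",
    "Nov 20 14:35:49 2008 4748 6 7.05 Task <1> submitted. Command <exit 1>;",
    "Nov 20 14:35:50 2008 4748 6 7.05 Task <1> exited with status 1.",
    "Nov 20 14:35:50 2008 4748 6 7.05 Task <1> submitted. Command <exit 1>;",
    "Nov 20 14:35:50 2008 4748 6 7.05 Task <1> exited with status 1.",
    "Nov 20 14:37:08 2008 5049 6 7.05 Task <1> had a dispatch error.",
]
counts = summarize(sample)
```

In the sample log, a requeued or retried task is written as a second "submitted" line, so these per-line counts can be higher than the "Submitted" figure in the corresponding Task Summary.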
After tasks have been submitted to the Session Scheduler and started, you can enable additional debugging in Session Scheduler components by sending a SIGUSR1 signal.
The additional log messages are sent to stderr.
The debug messages are saved to a file in /tmp/ssched/. You are responsible for deleting this file when it is no longer needed.
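Sending the signal can be scripted. The sketch below is one possible way to do it from Python on a POSIX system. This section does not say which process to signal, so the PID is left as a caller-supplied value, and the function name `request_debug` is hypothetical.

```python
import os
import signal


def request_debug(pid: int) -> None:
    """Send SIGUSR1 to a single process identified by its PID.

    Which Session Scheduler process to target is not specified here;
    the caller must supply the PID (an assumption of this sketch).
    """
    if pid <= 0:
        # os.kill treats 0 and negative values as process groups,
        # which is broader than signalling one process.
        raise ValueError("pid must be a positive process ID")
    os.kill(pid, signal.SIGUSR1)
```

The equivalent from a shell is the standard `kill -USR1` command followed by the PID.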
The Session Scheduler caches host information from LIM. If the host factor of a host changes after the Session Scheduler starts, the Session Scheduler does not see the updated value. The host factor is used in the task accounting log.
Session Scheduler does not support per-task memory or swap utilization tracking in ssacct. Run bacct to see aggregate memory and swap utilization.
When specifying a multiline command line as an ssched command line parameter, you must enclose the command in quotes. A multiline command line is any command containing a semicolon (;).
When specifying a multiline command line as a parameter in a task definition file, you must not use quotes.
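The two quoting rules above can be captured in a small helper for scripts that generate ssched invocations or task definition files. This is a sketch under stated assumptions: it handles only the command string itself, not any ssched options; the function names are hypothetical; and the use of single quotes (what `shlex.quote` produces) is a choice of this sketch, since the text says only "quotes".

```python
import shlex


def is_multiline(command: str) -> bool:
    """A multiline command line is any command containing a semicolon."""
    return ";" in command


def for_command_line(command: str) -> str:
    """Return the command as it would be typed as a shell argument.

    Multiline commands are wrapped in quotes; others are left unchanged.
    """
    return shlex.quote(command) if is_multiline(command) else command


def for_task_file(command: str) -> str:
    """Return the command for a task definition file, without enclosing quotes.

    If the whole command is a single quoted token, the quotes are removed.
    Commands that merely contain quoted parts are returned unchanged.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        return command  # unbalanced quotes: leave the text untouched
    if len(tokens) == 1 and command[:1] in ("'", '"'):
        return tokens[0]
    return command
```

For instance, `for_command_line("sleep 1; hostname")` yields a quoted string suitable for a shell, while `for_task_file` leaves the same command bare.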
If you submit a shell script containing multiple ssched commands, bjobs -l shows the task summary only for the currently running ssched instance. Enable task accounting and examine the accounting file to see information for tasks from all ssched instances in the shell script.
Submitting a large number of tasks as part of one session may cause a slight delay between when the Session Scheduler starts and when tasks are dispatched to execution agents. The Session Scheduler must parse and submit each task before it begins dispatching any tasks. Parsing 50,000 tasks can take up to 2 minutes before dispatching starts.
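For planning purposes, the figure above (up to 2 minutes for 50,000 tasks) works out to at most about 2.4 milliseconds per task. The sketch below scales that figure to other session sizes. Linear scaling is an assumption of this sketch, not something the documentation states, and the names used are hypothetical.

```python
# Reference point taken from the text: 50,000 tasks, up to 2 minutes.
REFERENCE_TASKS = 50_000
REFERENCE_SECONDS = 120.0


def parse_delay_upper_bound(n_tasks: int) -> float:
    """Rough upper bound, in seconds, on the delay before dispatching starts.

    Assumes the delay grows linearly with the number of tasks,
    which is an assumption of this sketch.
    """
    return n_tasks * REFERENCE_SECONDS / REFERENCE_TASKS
```

Under that assumption, a session of 10,000 tasks would wait up to roughly 24 seconds before the first task is dispatched.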
After all tasks have completed, the Session Scheduler takes some time to terminate all execution agents and clean up temporary files. A minimum of 20 seconds is normal, and larger allocations take longer.
Session Scheduler handles the following signals: SIGINT, SIGTERM, SIGUSR1, SIGSTOP, SIGTSTP, and SIGCONT. All other signals cause ssched to exit immediately. No summary is output and task accounting information is not saved. The signals Session Scheduler handles will be expanded in future releases.