


MultiCluster Job Forwarding Model


This model was developed for the high-throughput computing environment.



Overview of the Job Forwarding Model

In this model, the cluster that is starving for resources sends jobs over to the cluster that has resources to spare. Job status, pending reason, and resource usage are returned to the submission cluster. When the job is done, the exit code returns to the submission cluster.



Job Scheduling Under the Job Forwarding Model

Under this model, MultiCluster jobs are scheduled in two phases: first, the submission cluster selects a suitable remote receive-jobs queue and forwards the job to it; then the execution cluster selects a suitable host and dispatches the job to it. If no suitable host is found immediately, the job remains pending in the execution cluster and is evaluated again in the next scheduling cycle.

This method automatically favors local hosts; a MultiCluster send-jobs queue always attempts to find a suitable local host before considering a receive-jobs queue in another cluster.

Phase I, local scheduling phase (all jobs)

  1. The send-jobs queue receives the job submission request from a user.
  2. The send-jobs queue parameters affect whether or not the job is accepted. For example, a job that requires 100 MB memory will be rejected if queue-level parameters specify a memory limit of only 50 MB.
  3. If the job is accepted, it becomes pending in the send-jobs queue with a job ID assigned by the submission cluster.
  4. During the next scheduling cycle, the send-jobs queue attempts to place the job on a host in the submission cluster. If a suitable host is found, the job is dispatched locally.
  5. If the job cannot be placed locally (local hosts may not satisfy its resource requirements, or all the local hosts could be busy), the send-jobs queue attempts to forward the job to another cluster.

Phase II, job forwarding phase (MultiCluster submission queues only)

  1. The send-jobs queue has a list of remote receive-jobs queues that it can forward jobs to. If a job cannot be placed locally, the send-jobs queue evaluates each receive-jobs queue. All queues that can accept more MultiCluster jobs are candidates. To find out how many additional MultiCluster jobs a queue can accept, subtract the number of MultiCluster jobs already pending in the queue from the queue's pending MultiCluster job threshold (IMPT_JOBBKLG). The order of preference is determined by this remaining capacity; the queue evaluated first is the one with room to accept the most new MultiCluster jobs (see the example after this list).
  2. If information available to the submission cluster indicates that the first queue is suitable, LSF forwards the job to that queue.
  3. If the first queue is not suitable, LSF considers the next queue.
  4. If LSF cannot forward the job to any of the receive-jobs queues, the job remains pending in the send-jobs cluster and is evaluated again during the next scheduling cycle.
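For example, suppose the send-jobs queue can forward to two receive-jobs queues (the queue names and numbers here are illustrative only). queueA has IMPT_JOBBKLG=50 with 30 MultiCluster jobs already pending, so it can accept 50 - 30 = 20 more jobs. queueB has IMPT_JOBBKLG=100 with 95 already pending, so it can accept only 5 more. LSF evaluates queueA first because it has the larger remaining capacity.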

Phase III, remote scheduling phase (MultiCluster jobs only)

  1. The receive-jobs queue receives the MultiCluster job submission.
  2. The receive-jobs queue parameters affect whether or not the job is accepted. For example, a job that requires 100 MB memory will be rejected if queue-level parameters specify a memory limit of only 50 MB.
  3. If the job is rejected, it returns to the submission cluster.
  4. If the job is accepted, it becomes pending in the receive-jobs queue with a new job ID assigned by the execution cluster.
  5. During the next scheduling cycle, the receive-jobs queue attempts to place the job on a host in the execution cluster. If a suitable host is found, the job is dispatched. If a suitable host is not found, the job remains pending in the receive-jobs cluster, and is evaluated again the next scheduling cycle.
  6. If the job is dispatched to the execution host but cannot start, it returns to the submission cluster to be rescheduled. However, if the job repeatedly returns to the submission cluster because it could not be started in a remote cluster, LSF suspends the job (PSUSP) in the submission cluster.



Queue Scheduling Parameters Under the Job Forwarding Model

Forcing consistent scheduling behavior

If the queue policies of the send-jobs queue are the same as the queue policies of the receive-jobs queue, the user should see identical behavior, whether the job is scheduled locally or remotely.

Queue policies differ

The job-level (user-specified) requirements and queue-level parameters (set by the administrator) are used to schedule and run the job.

If a job runs in the submission cluster, the send-jobs queue parameters apply. If a job becomes a MultiCluster job and runs in another cluster, the receive-jobs queue parameters apply.

Since the receive-jobs queue policies replace the send-jobs queue policies, LSF users might notice that identical jobs are subject to different scheduling policies, depending on whether or not the job becomes a MultiCluster job.

Send-jobs queue parameters that affect MultiCluster jobs

Receive-jobs queue parameters that affect MultiCluster jobs

In general, queue-level policies set on the execution side are the only parameters that affect MultiCluster jobs:



Advance Reservations Across Clusters

Users can create and use advance reservations with the MultiCluster job forwarding model. To enable this feature, you must upgrade all clusters to LSF Version 7 or later.

Advance reservation

The user from the submission cluster negotiates an advance reservation with the administrator of the execution cluster. The administrator creates the reservation in the execution cluster.
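For example, the administrator of the execution cluster might create the reservation with the brsvadd command; the host names, user name, slot count, and times below are placeholders:

brsvadd -n 8 -m "hostA hostB" -u user1 -b 2010:03:01:08:00 -e 2010:03:01:18:00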

The reservation information is visible from the submission cluster. To submit a job and use the reserved resources, users specify the reservation at the time of job submission.

A job that specifies a reservation can only start on the reserved resources during the time of the reservation, even if other resources are available. Therefore, this type of job does not follow the normal scheduling process. Instead, the job is immediately forwarded to the execution cluster and is held in the PEND state until it can start. These jobs are not affected by the remote timeout limit (MAX_RSCHED_TIME in lsb.queues) because the system cannot automatically reschedule the job to any other cluster.

Missed reservations

If the execution cluster cannot accept the job because the reservation has expired or has been deleted, the job remains in the submission cluster in the PSUSP state.

The pending reason is:

Specified reservation has expired or has been deleted.

The job should be modified or killed by the owner.

If the execution cluster accepts the job, and the reservation expires or is deleted while the job is pending, the job remains in the execution cluster in the PEND state.

Broken connections

If cluster connectivity is interrupted, all remote reservation information is forgotten.

During this time, submission clusters cannot see remote reservations; jobs that were submitted with a remote reservation but not yet forwarded remain pending; and new jobs cannot use the reservation. Reservation information becomes available again only after cluster connectivity is re-established and the clusters have synchronized their reservation information. At that point, provided the reservation is still available, the pending jobs are forwarded, new jobs can be submitted with the specified reservation, and users can see the remote reservation.

Modifying a reservation

After an advance reservation is made, you can use brsvmod to modify the reservation.

Advance reservations can only be modified with brsvmod in the local cluster. A modified remote reservation is visible from the submission cluster. Jobs attached to the remote reservation are treated as local jobs when the advance reservation is modified in the remote cluster.

Deleting a reservation

After an advance reservation is made, you can use brsvdel to delete the reservation from the execution cluster.

brsvdel reservation_ID

If you try to delete the reservation from the submission cluster, you will see an error.

Submitting a job to use a reservation in a remote cluster

Submit the job and specify the remote advance reservation as shown:

bsub -U reservation_name@cluster_name

In this example, we assume the default queue is configured to forward jobs to the remote cluster.
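For instance, assuming a reservation named res1 was created in cluster2 (both names are placeholders), the submission might look like this:

bsub -U res1@cluster2 myjob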

Extending a reservation

bmod -t allows the job to keep running after the reservation expires.

In the submission cluster, bmod cannot be applied to pending jobs or to jobs that have already been forwarded to the remote cluster. However, it can be used in the execution cluster, where it behaves as it does for a local job.
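For example, to extend the termination time of a forwarded job in the execution cluster (the job ID and time below are placeholders):

bmod -t 2010:03:01:20:00 1234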



Special Considerations Under the Job Forwarding Model

Chunk jobs

Job chunking is done after a suitable host is found for the job. MultiCluster jobs can be chunked, but they are forwarded to the remote execution cluster one at a time, and chunked in the execution cluster. Therefore, the CHUNK_JOB_SIZE parameter in the submission queue is ignored by MultiCluster jobs that are forwarded to a remote cluster for execution.

If MultiCluster jobs are chunked, and one job in the chunk starts to run, both clusters display the WAIT status for the remaining jobs. However, the execution cluster behaves as if these jobs are in the PEND state, while the submission cluster behaves as if the jobs are in the RUN state. This affects the scheduling calculations for fairshare and limits.

Fairshare

If fairshare scheduling is enabled, resource usage information is a factor used in the calculation of dynamic user priority. MultiCluster jobs count towards a user's fairshare priority in the execution cluster, and do not affect fairshare calculations in the submission cluster.

There is no requirement that both clusters use fairshare or have the same fairshare policies. However, if you submit a job and specify a local user group for fairshare purposes (bsub -G), your job cannot run remotely unless you also belong to a user group of the same name in the execution cluster.

For more information on fairshare, see Administering Platform LSF.

Parallel jobs

A parallel job can be forwarded to another cluster, but the job cannot start unless the execution cluster has enough hosts and resources to run the entire job. A parallel job cannot span clusters.

Job requeue

If job requeue is enabled, LSF requeues jobs that finish with exit codes that indicate job failure.

For more information on job requeue, see Administering Platform LSF.

User-specified job requeue

brequeue in the submission cluster causes the job to be requeued in the send-jobs queue.

brequeue in the execution cluster causes the job to be requeued in the receive-jobs queue.

Automatic job requeue

  1. If job requeue (REQUEUE_EXIT_VALUES in lsb.queues) is enabled in the receive-jobs queue, and the job's exit code matches, the execution cluster requeues the job (it does not return to the submission cluster). Exclusive job requeue works properly.
  2. If the execution cluster does not requeue the job, the job returns to the send-jobs cluster, and gets a second chance to be requeued. If job requeue is enabled in the send-jobs queue, and the job's exit code matches, the submission cluster requeues the job.

    Exclusive job requeue values configured in the send-jobs queue always cause the job to be requeued, but for MultiCluster jobs the exclusive feature does not work; these jobs could be dispatched to the same remote execution host as before.

Automatic retry limits

The pre-execution command retry limit (MAX_PREEXEC_RETRY, LOCAL_MAX_PREEXEC_RETRY, and REMOTE_MAX_PREEXEC_RETRY), job requeue limit (MAX_JOB_REQUEUE), and job preemption retry limit (MAX_JOB_PREEMPT) configured in lsb.params, lsb.queues, and lsb.applications on the execution cluster are applied.

If the number of requeue attempts for a forwarded job exceeds the limit configured on the execution cluster, the job exits, returns to the submission cluster, and remains pending for rescheduling.

Job rerun

If job rerun is enabled, LSF automatically restarts running jobs that are interrupted due to failure of the execution host.

If queue-level job rerun (RERUNNABLE in lsb.queues) is enabled in both send-jobs and receive-jobs queues, only the receive-jobs queue reruns the job.

For more information on job rerun, see Administering Platform LSF.

  1. If job rerun is enabled in the receive-jobs queue, the execution cluster reruns the job. While the job is pending in the execution cluster, the job status is returned to the submission cluster.
  2. If the receive-jobs queue does not enable job rerun, the job returns to the submission cluster and gets a second chance to be rerun. If job rerun is enabled at the user level, or is enabled in the send-jobs queue, the submission cluster reruns the job.

Job migration

As long as a MultiCluster job is rerunnable (bsub -r or RERUNNABLE=yes in the send-jobs queue) and is not checkpointable, you can migrate it to another host, but you cannot specify which host. Migrated jobs return to the submission cluster to be dispatched with a new job ID.

For more information on job migration, see Administering Platform LSF.

User-specified job migration

To migrate a job manually, run bmig in either the submission or execution cluster, using the appropriate job ID. You cannot use bmig -m to specify a host. Operating in the execution cluster is more efficient than sending the bmig command through the submission cluster.

Automatic job migration

To enable automatic job migration, set the migration threshold (MIG in lsb.queues) in the receive-jobs queue. You can also set a migration threshold at the host level on the execution host (MIG in lsb.hosts). The lowest migration threshold applies to the job.

Automatic job migration configured in the send-jobs queue does not affect MultiCluster jobs.
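A minimal sketch of this configuration, assuming a receive-jobs queue named recv_q that accepts jobs from clusterA (the names and threshold value are examples only), in lsb.queues of the execution cluster:

Begin Queue
QUEUE_NAME   = recv_q
RCVJOBS_FROM = clusterA
MIG          = 30
DESCRIPTION  = Migrates suspended MultiCluster jobs after 30 minutes
End Queue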

Migration of checkpointable jobs

Checkpointable MultiCluster jobs cannot be migrated to another host. The migration action stops and checkpoints the job, then schedules the job on the same host again.

Checkpointing a MultiCluster job

Checkpointing of a MultiCluster job is only supported when the send-jobs queue is configured to forward jobs to a single remote receive-jobs queue, without ever using local hosts.

Checkpointable MultiCluster jobs resume on the same host.

For more information on checkpointing, see Administering Platform LSF.

Configuration

Checkpointing MultiCluster jobs

To enable checkpointing of MultiCluster jobs, define a checkpoint directory in both the send-jobs and receive-jobs queues (CHKPNT in lsb.queues), or in an application profile (CHKPNT_DIR, CHKPNT_PERIOD, CHKPNT_INITPERIOD, CHKPNT_METHOD in lsb.applications) of both submission cluster and execution cluster. LSF uses the directory specified in the execution cluster and ignores the directory specified in the submission cluster.

Checkpointing is not supported if a job runs on a leased host.

LSF writes the checkpoint file in a subdirectory named with the submission cluster name and submission cluster job ID. This allows LSF to checkpoint multiple jobs to the same checkpoint directory. For example, the submission cluster is ClusterA, the submission job ID is 789, and the send-jobs queue enables checkpointing. The job is forwarded to clusterB, the execution job ID is 123, and the receive-jobs queue specifies a checkpoint directory called XYZ_dir. LSF will save the checkpoint file in:

XYZ_dir/clusterA/789/
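A sketch of the queue configuration behind this example, with assumed queue names send_q and recv_q and an example checkpoint period of 60 minutes; only the checkpoint directory specified in the execution cluster is used.

In clusterA (lsb.queues):

Begin Queue
QUEUE_NAME = send_q
HOSTS      = none
SNDJOBS_TO = recv_q@clusterB
CHKPNT     = /share/chkpnt 60
End Queue

In clusterB (lsb.queues):

Begin Queue
QUEUE_NAME   = recv_q
RCVJOBS_FROM = clusterA
CHKPNT       = XYZ_dir 60
End Queue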

You cannot use bsub -k to make a MultiCluster job checkpointable.

Checkpointing a job

To checkpoint and stop a MultiCluster job, run bmig in the execution cluster and specify the local job ID. You cannot run bmig from the submission cluster. You cannot use bmig -m to specify a host.

Forcing a checkpointed job

Use brun to force any pending job to be dispatched immediately to a specific host, regardless of user limits and fairshare priorities. This is the only way to resume a checkpointed job on a different host. By default, these jobs attempt to restart from the last checkpoint.

Use brun -b if you want to make checkpointable jobs start over from the beginning (for example, this might be necessary if the new host does not have access to the old checkpoint directory).

Example

In this example, users in a remote cluster submit work to a data center using a send-jobs queue that is configured to forward jobs to only one receive-jobs queue. You are the administrator of the data center and you need to shut down a host for maintenance. The host is busy running checkpointable MultiCluster jobs.

Before you perform maintenance on a host in the execution cluster, take these steps:

  1. Run badmin hclose to close the host and prevent additional jobs from starting on the host.
  2. Run bmig and specify the execution cluster job IDs of the checkpointable MultiCluster jobs running on the host. For example, if jobs from a remote cluster use job IDs 123 and 456 in the local cluster, type the following command to checkpoint and stop the jobs:
    bmig 123 456 
    

    You cannot use bmig -m to specify a host.

  3. Allow the checkpoint process to complete. The jobs are requeued to the submission cluster. From there, they will be forwarded to the same receive-jobs queue again, and scheduled on the same host. However, if the host is closed, they will not start.
  4. Shut down LSF daemons on the host.

After you perform maintenance on a host, take these steps:

  1. Start LSF daemons on the host.
  2. Use badmin hopen to open the host. The MultiCluster jobs resume automatically.

Absolute priority scheduling

When absolute priority scheduling (APS) is enabled in the submission queue:

Strict resource requirement select string syntax

When LSF_STRICT_RESREQ=y is configured in lsf.conf, resource requirements are checked before jobs are forwarded to the remote cluster. If the selection string is valid, the job is forwarded.

When strict resource requirement checking configuration does not match between the submission and remote clusters, jobs may be rejected by the remote cluster.

Compute unit requirement strings

When a job is submitted with compute unit resource requirements, any requirements apply only to the execution cluster. Only the syntax of the resource requirement string is checked on the submission side, and if the cu[] string is valid, the job is forwarded.

When compute unit requirements cannot be satisfied in the remote cluster (such as a non-existent compute unit type) jobs may be rejected by the remote cluster. Hosts running LSF 7 Update 4 or earlier cannot satisfy compute unit resource requirements.



Enabling MultiCluster Queues

By default, clusters do not share resources, even if MultiCluster has been installed. To enable job forwarding, enable MultiCluster queues in both the submission and execution clusters.

How it works

Send-jobs queue

A send-jobs queue can forward jobs to a specified remote queue. By default, LSF attempts to run jobs in the local cluster first. LSF only attempts to place a job remotely if it cannot place the job locally.

Receive-jobs queue

A receive-jobs queue accepts jobs from queues in a specified remote cluster. Although send-jobs queues only forward jobs to specific queues in the remote cluster, receive-jobs queues that accept jobs from a remote cluster accept work from any and all queues in that cluster.

Multiple queue pairs

Steps

To set up a pair of MultiCluster queues, do the following:

  1. In the submission cluster, configure a send-jobs queue that forwards work to the execution queue.
  2. In the execution cluster, configure a receive-jobs queue that accepts work from the cluster that contains the send-jobs queue.

Send-jobs queues

To configure a send-jobs queue, define SNDJOBS_TO in the lsb.queues queue definition. Specify a space-separated list of queue names in the format queue_name@cluster_name.

If the send-jobs queue does not have SNDJOBS_TO configured, it cannot forward MultiCluster jobs. The job remains pending in the submission cluster and is evaluated again during the next scheduling cycle.

Make sure the lsb.queues HOSTS parameter specifies only local hosts (or the special keyword none). If HOSTS specifies any remote hosts, SNDJOBS_TO is ignored, and the queue behaves as a receive-jobs queue under the resource leasing method.

Receive-jobs queues

To configure a receive-jobs queue, define RCVJOBS_FROM in the lsb.queues queue definition. Specify a space-separated list of cluster names.

Example

Begin Queue
QUEUE_NAME=queue1
SNDJOBS_TO=queue2@cluster2 queue3@cluster3
RCVJOBS_FROM=cluster2 cluster3
PRIORITY=30
NICE=20
End Queue

This queue is both a send-jobs and receive-jobs queue, and links with multiple remote clusters. If queue1 cannot place a job in the local cluster, it can forward the job to queue2 in cluster2, or to queue3 in cluster3. If any queues in clusters 2 or 3 are configured to send MultiCluster jobs to queue1, queue1 accepts them.



Remote-Only Queues

By default, LSF tries to place jobs in the local cluster. If your local cluster is occupied, it may take a long time before your jobs can run locally. You might want to force some jobs to run on a remote cluster instead of the local cluster. Submit these jobs to a remote-only queue. A remote-only queue forwards all jobs to a remote cluster without attempting to schedule the job locally.

Configuring a remote-only queue

To make a queue that only runs jobs in remote clusters, take the following steps:

  1. Edit the lsb.queues queue definition for the send-jobs queue.
    • Define SNDJOBS_TO. This specifies that the queue can forward jobs to specified remote execution queues.
    • Set HOSTS to none. This specifies that the queue uses no local hosts.
  2. Edit the lsb.queues queue definition for each receive-jobs queue.
    • Define RCVJOBS_FROM. This specifies that the receive-jobs queue accepts jobs from the specified submission cluster.

Example

In cluster1:

Begin Queue
QUEUE_NAME = queue1
HOSTS = none
SNDJOBS_TO = queue2@cluster2
MAX_RSCHED_TIME = infinit
DESCRIPTION = A remote-only queue that sends jobs to cluster2.
End Queue

In cluster2:

Begin Queue
QUEUE_NAME = queue2
RCVJOBS_FROM = cluster1
DESCRIPTION = A queue that receives jobs from cluster1.
End Queue

Queue1 in cluster1 forwards all jobs to queue2 in cluster2.

Disabling timeout in remote-only queues

If you have a remote-only send-jobs queue that sends to only one receive-jobs queue, you should set MAX_RSCHED_TIME=infinit to maintain FCFS job order of MultiCluster jobs in the execution queue. Otherwise, jobs that time out are rescheduled to the same execution queue, but they lose priority and position because they are treated as a new job submission.

In general, the timeout is helpful because it allows LSF to automatically shift a pending MultiCluster job to a better queue.

Forcing a job to run in a remote cluster

You can use bsub -q and specify a remote-only MultiCluster queue if you want to prevent your job from running in the local cluster.

This is not compatible with bsub -m; when your job is forwarded to a remote queue, you cannot specify the execution host by name.

Example

queue1 is a remote-only MultiCluster queue.

bsub -q queue1 myjob
Job <101> is submitted to queue <queue1>.

This job will be dispatched to a remote cluster.



Remote Cluster Equivalency

By default, if no cluster name is specified, LSF utilities such as lsload return information about the local cluster.

If you configure a remote cluster to be equivalent to the local cluster, LSF displays information about the remote cluster as well. For example, lsload with no options lists hosts in the local cluster and hosts in the equivalent remote clusters.

The following commands automatically display information about hosts in a remote cluster if equivalency is configured:

Performance limitation

Expect performance in a cluster to decrease as the number of equivalent clusters increases, because you must wait while LSF retrieves information from each remote cluster in turn. Defining all clusters in a large MultiCluster system as equivalent can cause a performance bottleneck as the master LIM polls all clusters synchronously.

Transparency for users

To make resources in remote clusters as transparent as possible to the user, configure a remote cluster to be equivalent to the local cluster. The users see information about the local and equivalent clusters without having to supply a cluster name to the command.

Hosts in equivalent clusters are all identified by the keyword remoteHost instead of the actual host name. For example, bjobs -p -l will show remoteHost@cluster_name instead of host_name@cluster_name.

Simplifying MultiCluster administration

If you have many clusters configured to use MultiCluster, create one cluster for administrative purposes, and configure every other cluster to be equivalent to it. This allows you to view the status of all clusters at once, and makes administration of LSF easier.

Configuration

To specify equivalent clusters, set EQUIV in the RemoteClusters section of lsf.cluster.cluster_name to Y for the equivalent clusters.
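For example, to make cluster2 equivalent to the local cluster while leaving cluster3 non-equivalent (the cluster names are placeholders), the RemoteClusters section might look like this:

Begin RemoteClusters
CLUSTERNAME   EQUIV
cluster2      Y
cluster3      N
End RemoteClusters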



Remote Resources

If you do not require jobs to run only in the local cluster, you can allow the submission cluster's forwarding policy to consider remote resource availability before forwarding jobs. This allows more jobs to be forwarded because more resources are taken into account.

Configuring remote resource availability

To enable the submission forwarding policy to consider remote resource availability, define MC_PLUGIN_REMOTE_RESOURCE=y in lsf.conf.

Note


When MC_PLUGIN_REMOTE_RESOURCE is defined, only the following resource requirements are supported: -R "type==type_name", -R "same[type]", and -R "defined(resource_name)".
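For example, with the lsf.conf setting in place, a job submitted with one of the supported requirement strings can be forwarded based on remote availability (the queue name and host type below are placeholders):

MC_PLUGIN_REMOTE_RESOURCE=y

bsub -q send_q -R "type==LINUX86" myjob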



Pre-Exec Retry Threshold

When a job has a pre-execution command, LSF runs the job's pre-execution command first. By default, LSF retries the pre-execution command five times.

With a threshold configured, LSF returns the entire job to the submission cluster if the pre-execution command fails to run after a certain number of attempts. The submission cluster can then reschedule the job.

Configuring pre-exec retries

To limit the number of times the local cluster attempts to run the pre-execution command, set LOCAL_MAX_PREEXEC_RETRY in lsb.params and specify the maximum number of attempts. Configure MAX_PREEXEC_RETRY or REMOTE_MAX_PREEXEC_RETRY to limit pre-execution retry attempts on the remote cluster.

The pre-execution command retry limit configured in lsb.params, lsb.queues, and lsb.applications on the execution cluster is applied.
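A minimal sketch, assuming you want at most three attempts for local jobs and two attempts for forwarded MultiCluster jobs (both values are examples only), in lsb.params of the execution cluster:

# Attempts for the pre-execution command of local jobs
LOCAL_MAX_PREEXEC_RETRY = 3
# Attempts for the pre-execution command of forwarded MultiCluster jobs
REMOTE_MAX_PREEXEC_RETRY = 2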



Retry Threshold and Suspend Notification

If a job is forwarded to a remote cluster and then fails to start, it returns to the submission queue and LSF retries the job. After a certain number of failed retry attempts, LSF suspends the job (PSUSP). The job remains in that state until the job owner or administrator takes action to resume, modify, or remove the job.

By default, LSF tries to start a job up to 6 times (the threshold is 5 retry attempts). The retry threshold is configurable.

You can also configure LSF to send email to the job owner when the job is suspended. This allows the job owner to investigate the problem promptly. By default, LSF does not alert users when a job has reached its retry threshold.

Configuring retries

Set LSB_MC_INITFAIL_RETRY in lsf.conf and specify the maximum number of retry attempts. For example, to attempt to start a job no more than 3 times in total, specify 2 retry attempts:

LSB_MC_INITFAIL_RETRY = 2

Configuring mail notification

To make LSF email the user when a job is suspended after reaching the retry threshold, set LSB_MC_INITFAIL_MAIL in lsf.conf to y:

LSB_MC_INITFAIL_MAIL = y

By default, LSF does not notify the user.



Pending MultiCluster Job Limit

The pending MultiCluster job limit determines the maximum number of MultiCluster jobs that can be pending in the queue. The queue rejects jobs from remote clusters when this limit is reached. It does not matter how many MultiCluster jobs are running in the queue, or how many local jobs are running or pending.

By default, the limit is 50 pending MultiCluster jobs.

Configuring a pending MultiCluster job limit

Edit IMPT_JOBBKLG in lsb.queues, and specify the maximum number of MultiCluster jobs from remote clusters that can be pending in the queue. This prevents jobs from being over-committed to an execution cluster with limited resources.

If you specify the keyword infinit, the queue will accept an infinite number of jobs.
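A minimal sketch of a receive-jobs queue with a pending MultiCluster job limit (the queue name, cluster name, and limit are placeholders), in lsb.queues of the execution cluster:

Begin Queue
QUEUE_NAME   = recv_q
RCVJOBS_FROM = cluster1
IMPT_JOBBKLG = 80
DESCRIPTION  = Accepts at most 80 pending MultiCluster jobs from cluster1
End Queue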

Considerations

When you set the limit, consider the following:

Therefore, estimate your expected job flow and set the limit 50% or 100% higher than the estimate.

Example

Assume that locally submitted jobs do not occupy all the available resources, so you estimate that each processor can schedule and execute 2 MultiCluster jobs per scheduling session. To make full use of the job slots, and make sure the queue never runs out of jobs to dispatch, set the limit at 3 or 4 jobs per processor: if this queue has 20 processors, set the limit to allow 60 or 80 MultiCluster jobs pending. You expect to run about 40 of them immediately, and the remainder only wait for one scheduling cycle.



Updating the Pending Reason for MultiCluster Jobs

By default, the pending reasons for MultiCluster jobs are updated every 5 minutes by the execution cluster, but the maximum amount of data transferred between clusters is 512 KB. If LSF cannot update the pending reasons for all jobs at once, it will update the additional jobs during the next cycles.

You can disable the feature or modify how often the pending reasons are updated and how much data can be transferred at one time. Depending on load, updating the information very frequently or sending an unlimited amount of information can affect the performance of LSF.

Configuring the pending reason updating interval

To change the timing of pending reason updating between clusters, set MC_PENDING_REASON_UPDATE_INTERVAL in lsb.params in the execution cluster. Specify how often to update the information in the submission cluster, in seconds.

To disable pending reason updating between clusters, specify zero:

MC_PENDING_REASON_UPDATE_INTERVAL=0

Restriction

You must configure this parameter manually; you cannot use LSF GUI tools to add or modify this parameter.

Configuring the pending reason update package size

To change the package size of each pending reason update, set MC_PENDING_REASON_PKG_SIZE in lsb.params in the execution cluster. Specify the maximum package size, in KB.

To disable the limit and allow any amount of data in one package, specify zero:

MC_PENDING_REASON_PKG_SIZE=0

This parameter has no effect if pending reason updating is disabled (MC_PENDING_REASON_UPDATE_INTERVAL=0).

Restriction

You must configure this parameter manually; you cannot use LSF GUI tools to add or modify this parameter.
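For example, to update the submission cluster every two minutes and allow up to 1024 KB of data in each update (both values are illustrative only), you might set the following in lsb.params of the execution cluster:

MC_PENDING_REASON_UPDATE_INTERVAL=120
MC_PENDING_REASON_PKG_SIZE=1024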



Remote Timeout Limit

Remote timeout limit

The remote timeout limit is set in the submission cluster and determines how long a MultiCluster job stays pending in the execution cluster. After the allowed time, the job returns to the submission cluster to be rescheduled.

The remote timeout limit in seconds is:

MAX_RSCHED_TIME(lsb.queues) * MBD_SLEEP_TIME(lsb.params)

By default, MBD_SLEEP_TIME is one minute and the multiplying factor for MultiCluster is 20, so the timeout limit is normally 20 minutes.
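For example, with the default MBD_SLEEP_TIME of 60 seconds, the default MAX_RSCHED_TIME of 20 gives 20 x 60 = 1200 seconds (20 minutes). Setting MAX_RSCHED_TIME=360 in the send-jobs queue (an illustrative value) would let a forwarded job pend in the execution cluster for 360 x 60 = 21600 seconds (6 hours) before returning to the submission cluster.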

Problem with remote-only queues

By default, LSF queues dispatch jobs in FCFS order. However, there is one case in which the default behavior can be a problem. This is when a send-jobs queue sends to only one remote queue, and never uses local hosts.

In this case, jobs that time out in the receive-jobs cluster can only be re-dispatched to the same receive-jobs queue. When this happens, the receive-jobs queue treats the redispatched job as a new submission, gives it a new job ID, and gives it the lowest priority in FCFS ordering. In this way, the highest-priority MultiCluster job times out and then becomes the lowest-priority job. Also, since local jobs do not time out, the MultiCluster jobs get a lower priority than local jobs that have been pending for less time.

To make sure that jobs are always dispatched in the original order, you can disable remote timeout for the send-jobs queue.

Disabling timeout

To disable remote timeout, edit MAX_RSCHED_TIME in lsb.queues in the submission cluster, and specify the keyword infinit. This increases the remote timeout limit to infinity.

Even if the limit is set to infinity, jobs time out if a remote execution cluster gets reconfigured. However, all the pending jobs time out at once, so when the queue attempts to send them again, the original priority is maintained.


