Configure a dedicated queue for floating licenses

With a dedicated queue to run jobs requiring a floating software license, LSF reserves a software license before dispatching each job, and releases the license when the job finishes.

  1. Configure a dedicated queue to run jobs requiring a floating software license.

    The following example defines a queue named q_verilog in lsb.queues dedicated to jobs that require Verilog licenses:

    Begin Queue 
    QUEUE_NAME = q_verilog 
    RES_REQ=rusage[verilog=1:duration=1] 
    End Queue

    The queue named q_verilog contains jobs that reserve one Verilog license when started.

    If the Verilog licenses are not cluster-wide, but can only be used by some hosts in the cluster, the resource requirement string should include the defined() tag in the select section:

    select[defined(verilog)] rusage[verilog=1]

  2. Run the bhosts -s command to display the number of licenses being reserved by the dedicated queue.
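
For the host-restricted case described in step 1, the select string and the rusage string from the first example can be combined in one queue definition. This is only a sketch assembled from the two fragments above; it assumes the same q_verilog queue name and the same verilog resource:

```
Begin Queue
QUEUE_NAME = q_verilog
# Only hosts where the verilog resource is defined are eligible
RES_REQ = select[defined(verilog)] rusage[verilog=1:duration=1]
End Queue
```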

Prevent underutilization of licenses

If a job submitted to the dedicated queue does not actually use the floating license it requested, licenses can be underutilized.

LSF assumes that each job indicating that it requires a license actually uses it, and subtracts the total number of jobs requesting specific licenses from the total number available.

Use the duration keyword in the queue resource requirement specification to release the shared resource after the specified number of minutes expires.

By limiting the duration of the reservation and using the actual license usage as reported by the ELIM, you avoid underutilization and can account for licenses used outside of LSF.
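
The ELIM mentioned above could look roughly like the following. This is a minimal sketch, not a definitive implementation. It assumes that the licenses are served by a FlexNet-style license manager, that lmstat -f verilog prints a summary line of the form shown in the comment, and that a 60-second reporting interval is acceptable. The function names free_licenses and report_loop are hypothetical, and a real site would probably also need to point lmstat at its license server.

```shell
#!/bin/sh
# elim.verilog (hypothetical name): reports free Verilog licenses to LSF.

# Read license-manager status text on stdin and print issued minus in-use
# for the given feature. ASSUMED input line format:
#   Users of verilog:  (Total of 10 licenses issued;  Total of 3 licenses in use)
# Prints 0 when no matching line is found.
free_licenses() {
    awk -v feature="$1" '
        $1 == "Users" && $3 == (feature ":") {
            free = $6 - $11      # field 6 = issued, field 11 = in use
            found = 1
            exit
        }
        END { print (found ? free : 0) }
    '
}

# Report in the ELIM output format:
#   <number of indices> <name> <value>
report_loop() {
    while :; do
        free=$(lmstat -f verilog 2>/dev/null | free_licenses verilog)
        echo "1 verilog $free"
        sleep 60                 # ASSUMED reporting interval
    done
}

# Start reporting only when installed under an ELIM name (elim or elim.*).
case "${0##*/}" in
    elim|elim.*) report_loop ;;
esac
```

Because the value reported is the number of licenses actually free, licenses checked out by tools running outside of LSF reduce the count that LSF sees.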

When interactive jobs compete for licenses

When an interactive job outside the control of LSF competes with batch jobs for a software license, a batch job that has reserved the license may still fail to start because the interactive job takes the license first. To handle this situation, configure job requeue for each queue using REQUEUE_EXIT_VALUES in lsb.queues.

If a job exits with one of the values in REQUEUE_EXIT_VALUES, LSF requeues the job.

For example, jobs submitted to the following queue use Verilog licenses:

Begin Queue 
QUEUE_NAME = q_verilog 
RES_REQ=rusage[verilog=1:duration=1] 
# application exits with value 99 if it fails to get license 
REQUEUE_EXIT_VALUES = 99 
JOB_STARTER = lic_starter 
End Queue

All jobs in the queue are started by the job starter lic_starter, which checks whether the application failed to get a license and, if so, exits with an exit code of 99. This exit code causes the job to be requeued.

lic_starter job starter script

The lic_starter job starter can be coded as follows:

#!/bin/sh
# lic_starter: If application fails with no license, exit 99,
# otherwise, exit 0. The application displays
# "no license" when it fails without license available.
$* 2>&1 | grep "no license"
if [ $? != "0" ]
then
   exit 0     # string not found, application got the license
else
   exit 99
fi
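
In the script above, grep prints only the lines that match, so the rest of the application's output does not reach the job's output. One possible variant, shown here only as a sketch, keeps the same exit-code contract (99 when "no license" appears, 0 otherwise) while passing all output through. The function name run_with_license_check is hypothetical, and the sketch assumes that mktemp and tee are available on the execution hosts. It also uses "$@" rather than $* so that arguments containing spaces stay intact.

```shell
#!/bin/sh
# lic_starter variant (sketch): same exit codes as lic_starter, but the
# application's output is also copied to the job's stdout.

run_with_license_check() {
    # Temporary file that holds a copy of the combined stdout and stderr.
    log=$(mktemp) || return 1
    # tee writes the output to stdout and to the temporary file.
    "$@" 2>&1 | tee "$log"
    if grep -q "no license" "$log"; then
        rm -f "$log"
        return 99    # license not obtained; LSF requeues the job
    fi
    rm -f "$log"
    return 0         # string not found, application got the license
}

# Run only when a command line was supplied, as for a job starter.
if [ "$#" -gt 0 ]; then
    run_with_license_check "$@"
    exit $?
fi
```

Like the original, this variant reports 0 whenever the string is absent, so the application's own exit status is not propagated.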