Permanent LSF licenses are managed by the FlexNet license server daemon (lmgrd). The FlexNet license server daemon runs on a license server host you choose (for failover purposes, the daemon can run on multiple hosts).
The lmgrd daemon starts the LSF vendor license daemon, lsf_ld, which periodically checks how many LSF licenses are checked out and who holds them. Only one lsf_ld can run on a host. If lsf_ld stops running, lmgrd immediately stops serving LSF licenses to all LSF hosts.
The LIM on the LSF master host contacts the license server host to get the necessary LSF licenses. It then propagates the licenses to all LSF server hosts and client hosts. Multiple LSF clusters can get licenses from the same license server host.
The TIMEOUT ALL parameter in the FlexNet license option file changes timeout values, including how quickly the master host releases licenses during failover. LSF supports a minimum timeout value of 15 minutes.
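The 15-minute minimum can be expressed as a simple check. This is a minimal sketch, not part of LSF or FlexNet: the function name and the choice of minutes as the unit are assumptions made for illustration, and the sketch does not say what LSF does with a value below the minimum.

```python
# Hypothetical helper: the name and the unit (minutes) are illustrative only.
MIN_TIMEOUT_MINUTES = 15  # minimum timeout value LSF supports, per the text


def is_supported_timeout(minutes):
    """Return True if the timeout meets the 15-minute minimum."""
    return minutes >= MIN_TIMEOUT_MINUTES


print(is_supported_timeout(15))
print(is_supported_timeout(10))
```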
Only the master LIM can check out licenses. No other part of LSF has any contact with the FlexNet license server daemon. Once LIM on the master host identifies itself as the master, it reads the LSF_CONFDIR/lsf.cluster.cluster_name file to get the host information to calculate the total number of licenses needed. LSF software is licensed per core, not per host or per cluster, so hosts with multicore processors require multiple LSF licenses.
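Because licensing is per core, the total a cluster needs is the sum of the core counts of its hosts. The sketch below illustrates that arithmetic only. It assumes one license per core, uses hypothetical host names and core counts, and does not parse the actual lsf.cluster.cluster_name file.

```python
def licenses_needed(cores_per_host):
    """Sum the core counts of all hosts, assuming one license per core."""
    return sum(cores_per_host.values())


# Hypothetical cluster: host names and core counts are made up for illustration.
hosts = {"hostA": 4, "hostB": 8, "hostC": 1}
print(licenses_needed(hosts))  # 13
```

In this example, three hosts require 13 licenses, not 3, because each core counts separately.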
After the cluster is properly licensed, the master LIM contacts the license server daemon periodically to confirm the availability of checked out LSF licenses.
LIM distributes the licenses needed this way:
1. Calculates the total number of licenses needed for the master LIM.
2. Before slave LIMs contact the master, calculates the total number of licenses needed for all LSF server hosts and checks them out. When the slave LIMs start, they contact the master host to get the licenses they need.
3. Checks out licenses needed for client hosts listed in LSF_CONFDIR/lsf.cluster.cluster_name. If the license checkout fails for any host, that host is unlicensed. The master LIM tries to check out the license later.
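The checkout order described above (master first, then server hosts, then client hosts) can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the LIM implementation: the function name, the per-host checkout granularity, and the host names and license counts are all hypothetical, and the later retry is only noted in a comment.

```python
def distribute_licenses(available, master, servers, clients):
    """Simulate checkout in the order master, server hosts, client hosts.

    Each host is a (name, licenses_needed) tuple. Returns the licensed
    hosts, the unlicensed hosts, and the number of licenses left over.
    """
    licensed, unlicensed = [], []
    for name, needed in [master, *servers, *clients]:
        if needed <= available:
            available -= needed
            licensed.append(name)
        else:
            # Checkout failed: the host stays unlicensed for now.
            # In LSF, the master LIM would try again later.
            unlicensed.append(name)
    return licensed, unlicensed, available


result = distribute_licenses(
    available=10,
    master=("master", 4),
    servers=[("server1", 4), ("server2", 4)],
    clients=[("client1", 1)],
)
print(result)  # (['master', 'server1', 'client1'], ['server2'], 1)
```

With only 10 licenses available, server2 cannot be licensed, while the smaller request for client1 still succeeds.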
If the master LIM finds that the license server daemon has gone down or is unreachable, LSF allows a grace period before the whole cluster becomes unlicensed. As long as the master LIM that originally received the licenses is not restarted or shut down, the LSF cluster can run for up to 60 hours without licenses. This prevents the license server daemon from becoming a single point of failure and enables LSF to function reliably over an extended period of time (for example, over a long weekend) should the license server daemon fail. If you reconfigure LSF after the license server daemon becomes unavailable, you lose the grace period and the cluster becomes unlicensed, because reconfiguration kills and restarts the original LIM that carries the correct license information.
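The grace-period rule can be summarized as a small decision function. This is a sketch of the rule as stated here, not LSF code: the function and parameter names are hypothetical, and it assumes the 60 hours are counted from the moment the license server daemon became unreachable.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(hours=60)  # grace period stated in the text


def cluster_is_licensed(server_lost_at, now, lim_restarted_since_loss):
    """Return True while the cluster is still covered by the grace period.

    A restart or reconfiguration of the master LIM after the license
    server became unavailable ends the grace period immediately.
    """
    if lim_restarted_since_loss:
        return False
    return now - server_lost_at <= GRACE_PERIOD


# Hypothetical outage starting on a Friday evening.
lost = datetime(2024, 1, 5, 18, 0)
print(cluster_is_licensed(lost, datetime(2024, 1, 8, 5, 0), False))  # True (59 hours)
print(cluster_is_licensed(lost, datetime(2024, 1, 8, 7, 0), False))  # False (61 hours)
print(cluster_is_licensed(lost, datetime(2024, 1, 6, 9, 0), True))   # False (LIM restarted)
```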