Test Your LSF Installation

Before you make LSF available to users, make sure LSF is installed and operating correctly. You should:

  • Check the cluster configuration

  • Start the LSF daemons (LSF services)

  • Verify that your new cluster is operating correctly

If you have a mixed UNIX and Windows cluster, make sure you can perform operations from both UNIX and Windows hosts.

Check the licenses

These steps apply only if you have a permanent license. If you have a DEMO license, skip ahead to Check the cluster.

The FLEXlm License Server service is installed as a Windows service to start automatically.

  1. Select Start > Settings > Control Panel > Services and make sure the FLEXnet License Server service is started.
  2. From the command line, check the license server status and display the number of licenses available:

    C:\lsf\7.0\etc> lmutil lmstat -a -c %LSF_ENVDIR%/license.dat

  3. From the command line, display the products that are licensed for any host in the cluster:

    C:\lsf\7.0\bin> lshosts -l hostA

Check the cluster

Before using any LSF commands, wait a few minutes for LSF services to start.

  1. Log on to any host in the cluster.
  2. Check the configuration files.

    C:\LSF_7.0> lsadmin ckconfig -v

    Typical output is as follows:

    C:\LSF_7.0>lsadmin ckconfig -v
    Checking configuration files ...
    Platform EGO 1.2.3.98817, Nov 2 2007
    Copyright (C) 1992-2007 Platform Computing Corporation
    binary type: nt-x86
    Reading configuration from C:\LSF_7.0\conf\ego\cluster1\kernel/ego.conf
    Dec 21 08:38:59 2007 4196:1492 6 7.02 Lim starting...
    Dec 21 08:38:59 2007 4196:1492 6 7.02 LIM is running in advanced workload execution mode.
    Dec 21 08:38:59 2007 4196:1492 6 7.02 Master LIM is not running in EGO_DISABLE_UNRESOLVABLE_HOST mode.
    Dec 21 08:38:59 2007 4196:1492 5 7.02 C:\LSF_7.0\7.0\etc/lim.exe -C
    Dec 21 08:38:59 2007 4196:1492 7 7.02 setMyClusterName: searching cluster files...
    Dec 21 08:38:59 2007 4196:1492 7 7.02 setMyClusterName: local host hostA belongs to cluster cluster1
    Dec 21 08:38:59 2007 4196:1492 3 7.02 domanager(): C:\LSF_7.0\conf/lsf.cluster.cluster1(13): The cluster manager is the invoker <LSF\lsfadmin> in debug mode
    Dec 21 08:38:59 2007 4196:1492 6 7.02 reCheckClass: numhosts 1 so reset exchIntvl to 15.00
    Dec 21 08:38:59 2007 4196:1492 7 7.02 getDesktopWindow: no Desktop time window configured
    Dec 21 08:38:59 2007 4196:1492 6 7.02 Checking Done.
    ---------------------------------------------------------
    No errors found.
  3. Start the LSF cluster.
    1. If you have a Windows-only cluster, start the LSF cluster:
      C:\lsf\7.0\bin> lsfstartup

      This command starts the LSF services (Platform LIM, Platform RES, and Platform SBD) on all LSF Windows hosts. Startup can take up to 20 seconds.

    2. If you have a mixed UNIX-Windows cluster, log on to a UNIX host and run lsfstartup to start the UNIX daemons. Then log on to a Windows host and run lsfstartup again to start the LSF services on all Windows hosts.
  4. Display the cluster name and master host name:

    lsid
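The configuration check in step 2 can be scripted. The following is a minimal Python sketch, assuming the output of lsadmin ckconfig -v has already been captured as text and that a successful check ends with the line "No errors found." as in the typical output shown above. The function name ckconfig_passed and the shortened sample text are illustrative only.

```python
def ckconfig_passed(output):
    """Return True if the last non-blank line of the captured output
    is exactly 'No errors found.' (assumed success marker)."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return bool(lines) and lines[-1] == "No errors found."


# Shortened, illustrative sample based on the typical output above.
sample = """\
Checking configuration files ...
Dec 21 08:38:59 2007 4196:1492 6 7.02 Checking Done.
---------------------------------------------------------
No errors found.
"""

print(ckconfig_passed(sample))
```

Checking only the final line keeps the sketch independent of the log lines in between, which vary from host to host.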

Check the Load Information Manager (LIM)

  1. Display cluster configuration information about resources, host types, and host models:

    lsinfo

    The information displayed by lsinfo is configured in LSF_CONFDIR\lsf.shared.

  2. Display configuration information and status of LSF hosts:

    lshosts

    The output contains one line for each host in the cluster. Type, model, and resource information is configured in the LSF_CONFDIR\lsf.cluster.cluster_name file. The cpuf matches the CPU factor given for the host model in LSF_CONFDIR\lsf.shared.

  3. Display the current load levels of the cluster:

    lsload

    The output contains one line for each host in the cluster. The status should be ok for all hosts in your cluster.
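The status check in step 3 can be automated once the lsload output is captured as text. The Python sketch below assumes the first non-blank line is a header and that the host name and status are the first two whitespace-separated columns. The sample text and the function name hosts_not_ok are hypothetical and are not taken from a real cluster.

```python
def hosts_not_ok(lsload_output):
    """Return the names of hosts whose status column is not 'ok'.

    Assumes a header line followed by one line per host, with the
    host name in column 1 and the status in column 2.
    """
    lines = [line for line in lsload_output.splitlines() if line.strip()]
    bad = []
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        if len(fields) >= 2 and fields[1] != "ok":
            bad.append(fields[0])
    return bad


# Hypothetical sample output for illustration only.
sample = """\
HOST_NAME       status  r15s   r1m  r15m   ut    pg  ls    it   tmp   swp   mem
hostA               ok   0.0   0.0   0.0   1%   0.0   1     0  500M  900M  700M
hostB             busy   2.5   2.1   1.9  95%   3.0   4     0  200M  400M  100M
hostC          unavail
"""

print(hosts_not_ok(sample))
```

An empty list means every host reported ok, which is the expected state for a healthy new cluster.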

Check the Remote Execution Server (RES)

Before you run commands through the RES, you must register your user password with LSF using lspasswd.

  1. Run a command on one LSF host, using the RES:

    lsrun -v -m hostA hostname

  2. Run a command on a group of hosts, using the RES:

    lsgrun -v -m "hostA hostB hostC" hostname

  3. Check for OK status on cross-cluster configuration information:

    lsclusters -l

LSF on Platform EGO

LSF on Platform EGO allows EGO to serve as the central resource broker, enabling enterprise applications to benefit from sharing of resources across the enterprise grid.

How to handle parameters in lsf.conf with corresponding parameters in ego.conf

When EGO is enabled, existing LSF parameters (parameter names beginning with LSB_ or LSF_) that are set only in lsf.conf operate as usual because LSF daemons and commands read both lsf.conf and ego.conf.

Some existing LSF parameters have corresponding EGO parameter names in ego.conf (LSF_CONFDIR\lsf.conf is a separate file from LSF_CONFDIR\ego\cluster_name\kernel\ego.conf). You can keep your existing LSF parameters in lsf.conf, or you can set the corresponding EGO parameters in ego.conf that have not already been set in lsf.conf.

You cannot set LSF parameters in ego.conf, but you can set the following EGO parameters related to LIM, PIM, and ELIM in either lsf.conf or ego.conf:
  • EGO_DAEMONS_CPUS

  • EGO_DEFINE_NCPUS

  • EGO_SLAVE_CTRL_REMOTE_HOST

  • EGO_WORKDIR

  • EGO_PIM_SWAP_REPORT

You cannot set any other EGO parameters (parameter names beginning with EGO_) in lsf.conf. If EGO is not enabled, you can only set these parameters in lsf.conf.

Note:

If you specify a parameter in lsf.conf and you also specify the corresponding parameter in ego.conf, the parameter value in ego.conf takes precedence over the conflicting parameter in lsf.conf.

If the parameter is not set in either lsf.conf or ego.conf, the default that takes effect depends on whether EGO is enabled. If EGO is not enabled, the LSF default takes effect. If EGO is enabled, the EGO default takes effect. In most cases, the two defaults are the same.

Some parameters in lsf.conf do not have exactly the same behavior, valid values, syntax, or default value as the corresponding parameter in ego.conf, so in general, you should not set them in both files. If you need LSF parameters for backwards compatibility, you should set them only in lsf.conf.

If you have LSF 6.2 hosts in your cluster, they can only read lsf.conf, so you must set LSF parameters only in lsf.conf.
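The precedence rules in this note can be summarized as a small lookup. The Python sketch below is a simplified model, not LSF code: it assumes both files have already been parsed into dictionaries, that ego.conf is consulted only when EGO is enabled, and it uses a three-entry subset of the corresponding parameter names. The function name effective_value and its arguments are hypothetical.

```python
# Subset of corresponding names (lsf.conf name -> ego.conf name).
CORRESPONDING = {
    "LSF_LIM_PORT": "EGO_LIM_PORT",
    "LSF_LOG_MASK": "EGO_LOG_MASK",
    "LSF_MASTER_LIST": "EGO_MASTER_LIST",
}


def effective_value(lsf_name, lsf_conf, ego_conf, ego_enabled,
                    lsf_default=None, ego_default=None):
    """Resolve one parameter using the precedence described in the note."""
    ego_name = CORRESPONDING[lsf_name]
    # Assumption: ego.conf is only read when EGO is enabled.
    if ego_enabled and ego_name in ego_conf:
        return ego_conf[ego_name]      # ego.conf takes precedence
    if lsf_name in lsf_conf:
        return lsf_conf[lsf_name]
    # Set in neither file: the default depends on whether EGO is enabled.
    return ego_default if ego_enabled else lsf_default


lsf_conf = {"LSF_LIM_PORT": "6879"}
ego_conf = {"EGO_LIM_PORT": "7869"}

print(effective_value("LSF_LIM_PORT", lsf_conf, ego_conf, ego_enabled=True))
print(effective_value("LSF_LIM_PORT", lsf_conf, ego_conf, ego_enabled=False))
```

With EGO enabled the sketch returns the ego.conf value, and with EGO disabled it falls back to lsf.conf, which mirrors the advice to avoid setting the same parameter in both files.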

LSF and EGO corresponding parameters

The following table summarizes existing LSF parameters that have corresponding EGO parameter names. You must continue to set other LSF parameters in lsf.conf.

lsf.conf parameter            ego.conf parameter
----------------------------  ----------------------------
LSF_API_CONNTIMEOUT           EGO_LIM_CONNTIMEOUT
LSF_API_RECVTIMEOUT           EGO_LIM_RECVTIMEOUT
LSF_CLUSTER_ID (Windows)      EGO_CLUSTER_ID (Windows)
LSF_CONF_RETRY_INT            EGO_CONF_RETRY_INT
LSF_CONF_RETRY_MAX            EGO_CONF_RETRY_MAX
LSF_DEBUG_LIM                 EGO_DEBUG_LIM
LSF_DHPC_ENV                  EGO_DHPC_ENV
LSF_DYNAMIC_HOST_TIMEOUT      EGO_DYNAMIC_HOST_TIMEOUT
LSF_DYNAMIC_HOST_WAIT_TIME    EGO_DYNAMIC_HOST_WAIT_TIME
LSF_ENABLE_DUALCORE           EGO_ENABLE_DUALCORE
LSF_GET_CONF                  EGO_GET_CONF
LSF_GETCONF_MAX               EGO_GETCONF_MAX
LSF_LIM_DEBUG                 EGO_LIM_DEBUG
LSF_LIM_PORT                  EGO_LIM_PORT
LSF_LOCAL_RESOURCES           EGO_LOCAL_RESOURCES
LSF_LOG_MASK                  EGO_LOG_MASK
LSF_MASTER_LIST               EGO_MASTER_LIST
LSF_PIM_INFODIR               EGO_PIM_INFODIR
LSF_PIM_SLEEPTIME             EGO_PIM_SLEEPTIME


Parameters that have changed in LSF

The default for LSF_LIM_PORT has changed to accommodate EGO default port configuration. On EGO, default ports start with lim at 7869, and are numbered consecutively for pem, vemkd, and egosc.

This differs from previous LSF releases, where the default LSF_LIM_PORT was 6879. The res, sbatchd, and mbatchd daemons continue to use the default pre-version 7 ports 6878, 6881, and 6882.

Upgrade installation preserves existing port settings for lim, res, sbatchd, and mbatchd. EGO pem, vemkd, and egosc use default EGO ports starting at 7870, if they do not conflict with existing lim, res, sbatchd, and mbatchd ports.
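Before an upgrade, it can help to confirm that the preserved LSF ports do not collide with the default EGO ports. The Python sketch below assumes pem, vemkd, and egosc take 7870, 7871, and 7872 in that order, following the consecutive numbering described above. The function name conflicting_ego_ports is hypothetical, and the sketch only reports collisions; it does not model how the installer resolves them.

```python
# Default EGO ports after lim (7869), numbered consecutively.
DEFAULT_EGO_PORTS = {"pem": 7870, "vemkd": 7871, "egosc": 7872}


def conflicting_ego_ports(existing):
    """Map each EGO daemon whose default port is already taken to a
    (port, existing_daemon) pair. 'existing' maps daemon name -> port."""
    used = {port: name for name, port in existing.items()}
    return {
        ego: (port, used[port])
        for ego, port in DEFAULT_EGO_PORTS.items()
        if port in used
    }


# Pre-version 7 defaults named in the text: no collision expected.
legacy = {"lim": 6879, "res": 6878, "sbatchd": 6881, "mbatchd": 6882}
print(conflicting_ego_ports(legacy))

# Hypothetical site that moved lim to 7871.
print(conflicting_ego_ports({"lim": 7871, "res": 6878}))
```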

EGO connection ports and base port

On every host, a set of connection ports must be free for use by LSF and EGO components.

LSF and EGO require exclusive use of certain ports for communication. EGO uses the same four consecutive ports on every host in the cluster. The first of these is called the base port.

The default EGO base connection port is 7869. EGO uses four consecutive ports starting from the base port, so by default it uses ports 7869-7872.

The ports can be customized by customizing the base port. For example, if the base port is 6880, EGO uses ports 6880-6883.

LSF and EGO need the same ports on every host, so you must specify the same base port on every host.
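The relationship between the base port and the four EGO ports is simple arithmetic. The Python sketch below assumes the ports are assigned in the order lim, pem, vemkd, egosc, as described for the default ports earlier in this document. The function name ego_ports is hypothetical.

```python
# Order assumed from the default numbering: lim first, then pem, vemkd, egosc.
EGO_COMPONENTS = ("lim", "pem", "vemkd", "egosc")


def ego_ports(base_port=7869):
    """Return the four consecutive EGO ports starting at base_port."""
    highest_base = 65535 - (len(EGO_COMPONENTS) - 1)
    if not 1 <= base_port <= highest_base:
        raise ValueError("base port out of range: %r" % (base_port,))
    return {name: base_port + i for i, name in enumerate(EGO_COMPONENTS)}


print(ego_ports())      # default base port 7869 gives ports 7869-7872
print(ego_ports(6880))  # custom base port 6880 gives ports 6880-6883
```

Because every host must use the same base port, the same mapping applies across the whole cluster.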

Check LSF

The LIM and mbatchd must be running on the master host and on the submission host (the host from which you run the command).

  1. Verify the LSF daemon configuration:

    C:\LSF_7.0>badmin ckconfig -v

    The message "No errors found." is displayed.

  2. Run some basic commands and check the status: OK (hosts) and Open:Active (queues):

    bhosts

    bqueues

  3. Display the default queue:

    C:\lsf\bin>bparams

  4. Submit a test job to the default queue named normal:

    C:\lsf\7.0\bin> bsub sleep 60

    Job <1> is submitted to default queue <normal>.

  5. Display the job status:

    C:\lsf\7.0\bin> bjobs

    If all hosts are busy, the job is not started immediately and the STAT column says PEND. The job sleep 60 should take one minute to run. When the job completes, LSF sends mail reporting the job completion.
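When the test submission in step 4 is scripted, the job ID and queue name can be extracted from the bsub confirmation message. The Python sketch below assumes the message has the form shown in step 4; the function name parse_bsub and the second sample message are illustrative assumptions.

```python
import re

# Matches messages such as:
#   Job <1> is submitted to default queue <normal>.
SUBMIT_RE = re.compile(
    r"Job <(\d+)> is submitted to (?:default )?queue <([^>]+)>\."
)


def parse_bsub(output):
    """Return (job_id, queue_name), or None if no confirmation is found."""
    match = SUBMIT_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_bsub("Job <1> is submitted to default queue <normal>."))
```

The returned job ID can then be passed to bjobs to follow that specific job rather than listing all of the user's jobs.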

Test the Platform Management Console (PMC)

  1. Browse to the web server URL and log in to the PMC as user Admin with password Admin.
    • If you have only one management host (the master host), the web server URL is http://master_host:8080/platform.

    • If you have multiple management hosts, locate the web server:
      1. Log on as lsfadmin and run egosh client view.

        This command locates the PMC. It is only needed if EGO is enabled.

      2. Scan the client list for a name preceded by GUIURL, such as GUIURL_HostW.

      3. The additional information shows the web server URL; for example, http://Host_W:8080/platform.

  2. As a security measure, use the PMC to change the Admin and Guest account passwords from the simple default passwords, Admin and Guest.