


Using Platform LSF HPC with LAM/MPI




About Platform LSF HPC and LAM/MPI

LAM (Local Area Multicomputer) is an MPI programming environment and development system for heterogeneous computers on a network. With LAM, a dedicated cluster or an existing network computing infrastructure can act as one parallel computer solving one problem.

System requirements

Assumptions

Glossary

LAM

(Local Area Multicomputer) An MPI programming environment and development system for heterogeneous computers on a network.

MPI

(Message Passing Interface) A message passing standard. It defines a message passing API useful for parallel and distributed applications.

PAM

(Parallel Application Manager) The supervisor of any parallel job.

PJL

(Parallel Job Launcher) Any executable script or binary capable of starting parallel tasks on all hosts assigned for a parallel job.

RES

(Remote Execution Server) An LSF daemon residing on each host. It monitors and manages all LSF tasks on the host.

TS

(TaskStarter) An executable responsible for starting a task on the local host and reporting the process ID and host name to the PAM.

Files installed by lsfinstall

During installation, lsfinstall copies these files to the following directories:

These files...        Are installed to...
TaskStarter           LSF_BINDIR
pam                   LSF_BINDIR
esub.lammpi           LSF_SERVERDIR
lammpirun_wrapper     LSF_BINDIR
mpirun.lsf            LSF_BINDIR
pjllib.sh             LSF_BINDIR

Resources and parameters configured by lsfinstall



Configuring LSF HPC to work with LAM/MPI

System setup

  1. For troubleshooting LAM/MPI jobs, edit the LSF_BINDIR/lammpirun_wrapper script, and specify a log directory that all users can write to. For example:
    LOGDIR="/mylogs"

    Do not use LSF_LOGDIR for this log directory.

  2. Add the LAM/MPI home directory to your path. The LAM/MPI home directory is the directory that you specified as the prefix during LAM/MPI installation.
  3. To make the setting permanent, add the path to the LAM/MPI commands to the $PATH variable in your shell startup files ($HOME/.cshrc or $HOME/.profile).
  4. Edit lsf.cluster.cluster_name and add the lammpi resource for each host with LAM/MPI available. For example:
    Begin   Host
    HOSTNAME  model  type  server  r1m  mem  swp  RESOURCES
    ...
    hosta     !      !     1       3.5  ()   ()   (lammpi)
    ...
    End     Host
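The path setup in steps 2 and 3 can be sketched as shell commands. The prefix /usr/local/lam below is an assumed example install location, not part of the LSF or LAM/MPI distribution; substitute the prefix you specified when installing LAM/MPI:

```shell
# Assumed LAM/MPI install prefix -- replace with your own
LAMHOME=/usr/local/lam
# Put the LAM/MPI commands on the search path for this session;
# add the same line to $HOME/.profile (or the csh equivalent to
# $HOME/.cshrc) to make it permanent
export PATH="$LAMHOME/bin:$PATH"
```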
    



Submitting LAM/MPI Jobs

bsub command

Use bsub to submit LAM/MPI jobs:

bsub -a lammpi -n number_cpus [-q queue_name] mpirun.lsf 
[-pam "pam_options"] [mpi_options] job [job_options]

Examples

Submitting a job with a job script

A wrapper script is often used to call LAM/MPI. You can submit the job script either as an embedded script (redirected to bsub) or directly as the job command, for example:

% bsub -a lammpi -n 4 < embedded_jobscript
% bsub -a lammpi -n 4 jobscript

Your job script must use mpirun.lsf in place of the mpirun command.
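To illustrate, a minimal job script might look like the following sketch. This is a cluster job fragment meant to run under bsub; the application name myapp and its argument are hypothetical placeholders, not files shipped with LSF:

```shell
#!/bin/sh
# Minimal LAM/MPI job script sketch: mpirun.lsf replaces mpirun.
# ./myapp and its -input argument are hypothetical placeholders
# for your own MPI program and options.
mpirun.lsf ./myapp -input data.in
```

Submit it with, for example, bsub -a lammpi -n 4 ./jobscript.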

For information on generic PJL wrapper script components, see Running Parallel Jobs.

See Administering Platform LSF for information about submitting jobs with job scripts.

Job placement with LAM/MPI jobs

The mpirun -np option is ignored. For consistency with other Platform LSF HPC MPI integrations, use the LSB_PJL_TASK_GEOMETRY environment variable instead; LSB_PJL_TASK_GEOMETRY overrides the mpirun -np option.

LSF checks the LSB_PJL_TASK_GEOMETRY environment variable for all parallel jobs. If LSB_PJL_TASK_GEOMETRY is set and a user submits a parallel job (a job that requests more than one slot), LSF attempts to shape LSB_MCPU_HOSTS accordingly.
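For example, the geometry can be exported before submitting the job. The layout below, with tasks 0 and 2 in one group and tasks 1 and 3 in another, is purely illustrative:

```shell
# Illustrative task geometry: each parenthesized group of task IDs
# is placed together on one host
export LSB_PJL_TASK_GEOMETRY="{(0,2)(1,3)}"
```

The job is then submitted as usual, for example bsub -a lammpi -n 4 mpirun.lsf ./myapp, and mpirun.lsf honors the geometry in place of mpirun -np.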

Log files

For troubleshooting LAM/MPI jobs, define LOGDIR in the LSF_BINDIR/lammpirun_wrapper script. Log files (lammpirun_wrapper.job[job_ID].log) are written to the LOGDIR directory. If LOGDIR is not defined, log messages are written to /dev/null.

For example, the log file for the job with job ID 123 is:

lammpirun_wrapper.job123.log
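The naming convention can be sketched as follows; /mylogs is the example LOGDIR value from the setup steps, and the job ID is arbitrary:

```shell
# Build the wrapper log file path for a given job ID
LOGDIR="/mylogs"    # the directory set in lammpirun_wrapper
JOBID=123
LOGFILE="$LOGDIR/lammpirun_wrapper.job$JOBID.log"
echo "$LOGFILE"     # prints /mylogs/lammpirun_wrapper.job123.log
```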





      Date Modified: August 20, 2009
Platform Computing: www.platform.com

Platform Support: support@platform.com
Platform Information Development: doc@platform.com

Copyright © 1994-2009 Platform Computing Corporation. All rights reserved.