- About Platform LSF HPC and MPICH-P4
- Configuring LSF HPC to Work with MPICH-P4
- Submitting MPICH-P4 Jobs
About Platform LSF HPC and MPICH-P4
MPICH is a freely available, portable implementation of the MPI Standard for message-passing libraries, developed by Argonne National Laboratory jointly with Mississippi State University. MPICH is designed to provide high performance, portability, and a convenient programming environment.
MPICH-P4 is an MPICH implementation for the ch_p4 device, which supports SMP nodes, MPMD programs, and heterogeneous collections of systems.
Requirements
Assumptions and limitations
- MPICH-P4 is installed and configured correctly
- The user's current working directory is part of a shared file system reachable by all hosts
- The directory specified by the MPICH_HOME variable is accessible by the same path on all hosts
- Process group files are not supported. The mpich.ch_p4 p4pg option is ignored.
Glossary
MPI (Message Passing Interface): A message passing standard. It defines a message passing API useful for parallel and distributed applications.
MPICH: A portable implementation of the MPI standard.
MPICH-P4: An MPI implementation based on MPICH for the ch_p4 device.
PAM (Parallel Application Manager): The supervisor of any parallel job.
PJL (Parallel Job Launcher): Any executable script or binary capable of starting parallel tasks on all hosts assigned for a parallel job.
RES (Remote Execution Server): An LSF daemon residing on each host. It monitors and manages all LSF tasks on the host.
TS (TaskStarter): An executable responsible for starting a task on the local host and reporting the process ID and host name to the PAM.
For more information
- See the Mathematics and Computer Science Division (MCS) of Argonne National Laboratory (ANL) MPICH Web page at www-unix.mcs.anl.gov/mpi/mpich/ for more information about MPICH and MPICH-P4.
Files installed by lsfinstall
During installation, lsfinstall copies these files to the following directories:
These files...     Are installed to...
TaskStarter        LSF_BINDIR
pam                LSF_BINDIR
esub.mpichp4       LSF_SERVERDIR
mpichp4_wrapper    LSF_BINDIR
mpirun.lsf         LSF_BINDIR
pjllib.sh          LSF_BINDIR
Resources and parameters configured by lsfinstall
- External resources in lsf.shared:
    Begin Resource
    RESOURCE_NAME   TYPE      INTERVAL   INCREASING   DESCRIPTION
    ...
    mpichp4         Boolean   ()         ()           (MPICH P4 MPI)
    ...
    End Resource
  The mpichp4 Boolean resource is used for mapping hosts with MPICH-P4 available.
  You should add the mpichp4 resource name under the RESOURCES column of the Host section of lsf.cluster.cluster_name.
- Parameter in lsf.conf:
    LSB_SUB_COMMANDNAME=y
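As a quick check after installation, the following sketch tests whether a given lsf.conf contains the LSB_SUB_COMMANDNAME=y setting. The function name has_sub_commandname and the sample file contents are illustrative assumptions, not part of LSF; on a real cluster you would pass the path of your own lsf.conf.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given lsf.conf enables LSB_SUB_COMMANDNAME.
has_sub_commandname() {
    conf_file=$1
    # Match the exact setting at the start of a line; trailing spaces are tolerated.
    grep -Eq '^LSB_SUB_COMMANDNAME=y[[:space:]]*$' "$conf_file"
}

# Demonstration on a throwaway file instead of a real lsf.conf.
sample_conf=$(mktemp)
printf 'LSF_BINDIR=/opt/lsf/bin\nLSB_SUB_COMMANDNAME=y\n' > "$sample_conf"
if has_sub_commandname "$sample_conf"; then
    echo "LSB_SUB_COMMANDNAME is enabled"
else
    echo "LSB_SUB_COMMANDNAME is missing"
fi
rm -f "$sample_conf"
```

The check only reads the file, so it is safe to run against a production configuration.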
Configuring LSF HPC to Work with MPICH-P4
mpichp4_wrapper script
Modify the mpichp4_wrapper script in LSF_BINDIR to set MPICH_HOME. The default is:
    MPICH_HOME="/opt/mpich-1.2.5.2-ch_p4/"
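One possible way to prepare this change without hand-editing is sketched below. It assumes the wrapper sets MPICH_HOME on a line of its own starting in the first column, which may not match your installed copy. The function name set_mpich_home and the path /usr/local/mpich-ch_p4 are hypothetical. The modified script is printed to standard output so that it can be reviewed before replacing the wrapper.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the wrapper with MPICH_HOME changed.
# $1 = path to mpichp4_wrapper, $2 = new MPICH installation directory.
set_mpich_home() {
    # '|' is the sed delimiter, so the new path may contain '/' but not '|' or '&'.
    sed "s|^MPICH_HOME=.*|MPICH_HOME=\"$2\"|" "$1"
}

# Demonstration on a stand-in file containing the default line.
wrapper=$(mktemp)
printf '#!/bin/sh\nMPICH_HOME="/opt/mpich-1.2.5.2-ch_p4/"\n' > "$wrapper"
set_mpich_home "$wrapper" /usr/local/mpich-ch_p4
rm -f "$wrapper"
```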
Submitting MPICH-P4 Jobs
bsub command
Use bsub to submit MPICH-P4 jobs.
bsub -a mpichp4 -n number_cpus mpirun.lsf [-pam "pam_options"] [mpi_options] job [job_options]
- -a mpichp4 tells esub the job is an MPICH-P4 job and invokes esub.mpichp4.
- -n number_cpus specifies the number of processors required to run the job.
- mpirun.lsf reads the environment variable LSF_PJL_TYPE=mpichp4 set by esub.mpichp4, and generates the appropriate pam command line to invoke MPICH-P4 as the PJL.
For example:
% bsub -a mpichp4 -n 3 mpirun.lsf /examples/cpi
A job named cpi will be dispatched and run on 3 CPUs in parallel.
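When submissions are scripted, it can help to assemble the command line in one place. The sketch below only prints the bsub command rather than running it, so it can be tried on a machine without LSF. The function name mpichp4_bsub_cmd is a hypothetical helper, and the optional -pam and mpi_options arguments are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper: print the bsub command line for an MPICH-P4 job.
# $1 = number of CPUs; remaining arguments = job and its options.
mpichp4_bsub_cmd() {
    ncpus=$1
    shift
    # "$*" joins the remaining arguments with single spaces (default IFS).
    echo "bsub -a mpichp4 -n $ncpus mpirun.lsf $*"
}

mpichp4_bsub_cmd 3 /examples/cpi
# prints: bsub -a mpichp4 -n 3 mpirun.lsf /examples/cpi
```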
- To start the P4 secure server, run the following command:
% $MPICH_HOME/bin/serv_p4 -o -p port
where port is the port number of the MPICH-P4 secure server.
- Submit your job with the -p4ssport option using the following syntax:
bsub -a mpichp4 -n number_cpus mpirun.lsf [-pam "pam_options"] [mpi_options] -p4ssport port job [job_options]
where port is the port number of the MPICH-P4 secure server.
You must specify the full path for the job.
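The two steps can be sketched together as follows. This is an illustration under stated assumptions, not a supported tool: the function name p4_secure_cmds and the port number 4321 are hypothetical, and the commands are printed rather than executed. The sketch also rejects a job that is not given as a full path, reflecting the requirement above.

```shell
#!/bin/sh
# Hypothetical helper: print the two commands for a secure-server run.
# $1 = port, $2 = number of CPUs, $3 = full path of the job.
p4_secure_cmds() {
    port=$1 ncpus=$2 job=$3
    case $job in
        /*) ;;  # absolute path: accepted
        *)  echo "error: job must be given as a full path: $job" >&2
            return 1 ;;
    esac
    # Single quotes keep $MPICH_HOME literal so it is expanded when the command runs.
    echo '$MPICH_HOME/bin/serv_p4 -o -p '"$port"
    echo "bsub -a mpichp4 -n $ncpus mpirun.lsf -p4ssport $port $job"
}

p4_secure_cmds 4321 3 /examples/cpi
```

The same port value is passed to both commands, which is the point of keeping them in one helper.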
See the MPICH-P4 documentation for more information about the p4ssport secure server option of the mpirun.ch_p4 command.
Task geometry with MPICH-P4 jobs
MPICH-P4 mpirun requires either the first task to run on the local node or all tasks to run on remote nodes (-nolocal). If the LSB_PJL_TASK_GEOMETRY environment variable is set, mpirun.lsf makes sure the task group that contains task 0 in LSB_PJL_TASK_GEOMETRY runs on the first node.
The environment variable LSB_PJL_TASK_GEOMETRY is checked for all parallel jobs. If LSB_PJL_TASK_GEOMETRY is set and users submit a parallel job (a job that requests more than 1 slot), LSF attempts to shape LSB_MCPU_HOSTS accordingly.
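To see which task group would be placed on the first node, the group containing task 0 can be picked out of a geometry string. The sketch below assumes the geometry is written as "{(task_ID,...)(task_ID,...)...}", which is the form used in the general LSF task geometry documentation and is not stated in this section. The function name group_with_task0 and the sample value are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the task group that contains task 0.
# Assumes the geometry format "{(id,id,...)(id,...)...}".
group_with_task0() {
    printf '%s\n' "$1" |
        tr -d '{} ' |             # drop outer braces and spaces
        tr ')' '\n' |             # one group per line
        tr -d '(' |               # drop the opening parenthesis of each group
        grep -E '(^|,)0(,|$)'     # keep the group where 0 is a whole task ID
}

group_with_task0 "{(2,5,7)(0,6)(1,3)(4)}"
# prints: 0,6
```

With this sample value, tasks 0 and 6 form the group that mpirun.lsf would place on the first node.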
Submitting a job with a job script
You can submit a job using a job script as an embedded script or directly as a job, for example:
% bsub -a mpichp4 -n 4 < embedded_jobscript
% bsub -a mpichp4 -n 4 jobscript
Your job script must use mpirun.lsf in place of the mpirun command.
For information on generic PJL wrapper script components, see Running Parallel Jobs.
See Administering Platform LSF for information about submitting jobs with job scripts.
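A minimal job script might look like the one generated below. This is a sketch only: the function name write_jobscript is hypothetical, the job path /examples/cpi is reused from the earlier example, and the script is written to a temporary file so the sketch can be run anywhere.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal job script to the given path.
write_jobscript() {
    cat > "$1" <<'EOF'
#!/bin/sh
# mpirun.lsf replaces a direct call to mpirun inside the job script.
mpirun.lsf /examples/cpi
EOF
}

jobscript=$(mktemp)
write_jobscript "$jobscript"
cat "$jobscript"
rm -f "$jobscript"
```

A script written this way to a file named embedded_jobscript could then be submitted with bsub -a mpichp4 -n 4 < embedded_jobscript.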
Date Modified: March 13, 2009
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2009 Platform Computing Corporation. All rights reserved.