Message Passing Toolkit (MPT) User's Guide
(document number: 007-3773-012 / published: 2009-10-22)
Chapter 2. Administrating MPT
This chapter is provided for system administrators who install,
configure, and administer software on SGI Altix systems. It covers
finding the MPT release notes, installing MPT, and configuring the system.
Finding the MPT Release Notes
Find the latest MPT release notes on your system, as follows:
% rpm -qi sgi-mpt | grep README.relnotes
/opt/sgi/mpt/mpt-1.25/doc/README.relnotes
Next, change directory to the location found and list the contents
of the directory, as follows:
% cd /opt/sgi/mpt/mpt-1.25/doc
% ls
MPT_UG README.relnotes sgi-mpt.1.25.template
The release notes are in a file called README.relnotes.
This section describes requirements and procedures
for MPT installation. After you have installed MPT and the prerequisite
software per the instructions in this section, be sure to perform the
steps described in “System Configuration”.
Note: The MPT installation and configuration information found
in this chapter is also available in the README.relnotes
file in the /opt/sgi/mpt/mpt-1.25/doc directory.
Disk space requirements for the MPT product are substantially less than
20 Mbytes.
This section describes software prerequisites for MPT.
A default install of SGI ProPack is recommended. This provides a
number of software components required by MPT. The SGI ProPack RPMs required
by MPT include the following:
InfiniBand Software Stack
If you are using the InfiniBand interconnect, you must ensure that
one of the supported InfiniBand software stacks is installed. These include
the Voltaire ibhost or Gridstack software packages and
the OpenFabrics Enterprise Distribution (OFED) software provided with
SGI ProPack 5 SP2 (or later) on SGI Altix XE and SGI Altix ICE systems
and with SGI ProPack 5 SP3 (or later) on SGI Altix 4000 series systems.
MPT is supplied as
an RPM file. The name of the file contains the following information:
For example, the MPT RPM for the MPT 1.25 release
is named sgi-mpt-1.25-sgi605r1.ia64.rpm. To install this
RPM, log in as root and issue the following command:
# rpm -Uvh sgi-mpt-1.25-sgi605r1.ia64.rpm
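RPM file names conventionally take the form name-version-release.arch.rpm. The following is a minimal sketch, assuming the MPT RPM follows that convention, of how the fields can be separated in a script. The parse_rpm_name function is a hypothetical helper written for this illustration and is not part of MPT or of rpm.

```shell
#!/bin/sh
# parse_rpm_name FILE: split a conventional RPM file name,
#   <name>-<version>-<release>.<arch>.rpm
# into its fields. Hypothetical helper, not an MPT or rpm tool.
parse_rpm_name() {
    base=${1##*/}          # drop any leading directory
    base=${base%.rpm}      # drop the .rpm suffix
    arch=${base##*.}       # text after the last dot
    base=${base%.*}
    release=${base##*-}    # text after the last hyphen
    base=${base%-*}
    version=${base##*-}    # text after the next-to-last hyphen
    name=${base%-*}        # everything that remains
    echo "name=$name version=$version release=$release arch=$arch"
}

parse_rpm_name sgi-mpt-1.25-sgi605r1.ia64.rpm
# prints: name=sgi-mpt version=1.25 release=sgi605r1 arch=ia64
```

Parsing from the right is a deliberate choice, because the package name itself (sgi-mpt) contains a hyphen.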
Installing MPT Software in an Alternate Location
RPM provides a means for
creating, installing, and managing relocatable packages. That is, the
MPT RPM can be installed in either a default or alternate location.
The default location for installing the MPT RPM is /opt/sgi/mpt/.
To install the MPT RPM in an alternate location, use the
--relocate option, as shown in the following example. The
--relocate option maps the default installation directory to the
alternate base directory for the installation of the MPT software
(in this case, /tmp). The --excludepath option skips the files that
the package would otherwise install under /usr.
# rpm -i --relocate /opt/sgi/mpt/mpt-1.25=/tmp --excludepath /usr sgi-mpt-1.25-sgi605r1.ia64.rpm
Note: If the MPT software is installed in an alternate location,
MPT users must set the environment variables PATH and
LD_LIBRARY_PATH to specify the search locations for the
mpirun command and the run-time libraries. Assuming the alternate
location of /tmp, set them as follows.
For csh or tcsh:
setenv PATH /tmp/bin:${PATH}
setenv LD_LIBRARY_PATH /tmp/lib
For sh or bash:
export PATH=/tmp/bin:${PATH}
export LD_LIBRARY_PATH=/tmp/lib
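Assigning LD_LIBRARY_PATH outright discards any directories a user has already placed in it. As a minimal sketch for sh-compatible shells, the hypothetical prepend_path helper below (a name introduced only for this illustration) adds a directory to the front of a colon-separated variable while preserving the existing entries. The /tmp paths are the alternate location assumed in this section.

```shell
#!/bin/sh
# prepend_path VAR DIR: put DIR at the front of the colon-separated list
# held in VAR, unless DIR is already somewhere in the list.
# Hypothetical helper, not part of MPT.
prepend_path() {
    eval "current=\${$1:-}"                     # read the variable named by $1
    case ":$current:" in
        *":$2:"*) ;;                            # already present: leave unchanged
        *) eval "$1=\"$2\${current:+:\$current}\"" ;;
    esac
    export "$1"
}

prepend_path PATH /tmp/bin
prepend_path LD_LIBRARY_PATH /tmp/lib
echo "$LD_LIBRARY_PATH"
```

Calling the helper twice with the same directory does not create a duplicate entry.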
If the site is using environment modules to manage the user environment,
then the alternate location should be placed in the mpt
modulefile. This approach is the most convenient way to establish environment
variable settings that enable MPT program developers and users to access
the MPT software when installed in an alternate location. Sample
modulefiles are located in /opt/sgi/mpt/mpt-1.25/doc
and /usr/share/modules/modulefiles/mpt/1.25.
For more information, see "Using Dynamic Shared Libraries to Run
MPI Jobs," later in this chapter.
Using a cpio File for Installation
The cpio file installation method described here is useful when the
MPT software is installed in an NFS filesystem shared by a number of hosts.
In this case, it is neither important nor desirable for the RPM database on
only one of the machines to track the versions of MPT that are installed.
Another advantage of this approach is that you do not need root permission
to install the MPT software.
To install MPT using a cpio file, first convert
the MPT RPM to a cpio file by executing the
rpm2cpio command, as follows:
% rpm2cpio sgi-mpt-1.25-1.ia64.rpm > /tmp/sgi-mpt.cpio
Once you have created the .cpio file, you are
free to install the software beneath any directory in which you have write
permission. The following example demonstrates the process.
% cd /tmp
% cpio -idmv < sgi-mpt.cpio
opt/sgi/mpt/mpt-1.25/bin/mpirun
opt/sgi/mpt/mpt-1.25/include/mpi++.h
opt/sgi/mpt/mpt-1.25/include/mpi.h
...
opt/sgi/mpt/mpt-1.25/lib/libmpi++.so
opt/sgi/mpt/mpt-1.25/lib/libmpi.so
opt/sgi/mpt/mpt-1.25/lib/libxmpi.so
...
% ls -R /tmp/opt/sgi/mpt/mpt-1.25
bin doc include lib man
/tmp/opt/sgi/mpt/mpt-1.25/bin:
mpirun
/tmp/opt/sgi/mpt/mpt-1.25/include:
MPI.mod mpi.h mpi_ext.h mpif.h mpio.h mpp
mpi++.h mpi.mod mpi_extf.h mpif_parameters.h mpiof.h
/tmp/opt/sgi/mpt/mpt-1.25/lib:
libmpi++.so* libmpi.so* libsma.so* libxmpi.so*
...
If the MPT software is installed in an alternate location, set up
an environment module that sets the environment variables used
by compilers, linkers, and run-time loaders to reference the MPT software.
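Where environment modules are not available, the same settings can be generated by a small script. The following sketch assumes the cpio archive was unpacked under /tmp as in the example above and that GCC is the compiler, since CPATH and LIBRARY_PATH are GCC search-path variables; other compilers may need -I and -L options instead. The mpt_env function is a hypothetical name used only here.

```shell
#!/bin/sh
# mpt_env PREFIX VERSION: print sh-style settings for an MPT tree that was
# unpacked beneath PREFIX. Hypothetical helper; GCC is assumed.
mpt_env() {
    root=$1/opt/sgi/mpt/mpt-$2
    echo "export PATH=$root/bin:\$PATH"        # locate mpirun
    echo "export CPATH=$root/include"          # GCC header search path
    echo "export LIBRARY_PATH=$root/lib"       # GCC link-time search path
    echo "export LD_LIBRARY_PATH=$root/lib"    # run-time loader search path
}

mpt_env /tmp 1.25
```

A user can apply the settings to the current shell with eval "$(mpt_env /tmp 1.25)". Note that this sketch overwrites CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH rather than extending them.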
Installation Conflicts with Multiple MPIs
The MPT and LAM MPI
RPMs conflict with each other. To install both MPT and LAM MPI on the
same system, you can install the MPT RPM in an alternate location, as described
previously in “Installing MPT Software in an Alternate Location”.
Using Dynamic Shared Libraries to Run MPI Jobs
After you have installed the MPT RPM in the default location, use the
following commands to build and run an MPI-based application that uses
the .so files.
For C programs:
% gcc simple1_mpi.c -lmpi
% mpirun -np 2 a.out
For Fortran programs:
% f77 -I/usr/include simple1_mpi.f -lmpi
% mpirun -np 2 a.out
The default locations for the include and
.so files and the mpirun command are referenced
automatically.
Assuming that the MPT package has been installed in an alternate
location (under the /tmp directory), as described earlier in “Installing MPT Software in an Alternate Location”,
the commands to compile, link, and check are as follows:
% gcc -I /tmp/usr/include simple1_mpi.c -L/tmp/usr/lib -lmpi
% ldd a.out
libmpi.so => /usr/lib/libmpi.so (0x40019000)
libc.so.6 => /lib/libc.so.6 (0x402ac000)
libdl.so.2 => /lib/libdl.so.2 (0x4039a000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
As shown above, compiling with alternate-location libraries does
not mean that your program will run with them. Note that libmpi.so
is resolved to /usr/lib/libmpi.so, which
is the default-location library. If you are going to use an alternate
location for the .so files, it is important to set
the LD_LIBRARY_PATH environment variable. If the site
is using environment modules, this can be done in the mpt
modulefile. Otherwise, the user must set LD_LIBRARY_PATH,
as in the following example:
% setenv LD_LIBRARY_PATH /tmp/usr/lib
% ldd a.out
libmpi.so => /tmp/usr/lib/libmpi.so (0x40014000)
libc.so.6 => /lib/libc.so.6 (0x402ac000)
libdl.so.2 => /lib/libdl.so.2 (0x4039a000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
This example shows the library being resolved to the correct file.
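The check performed by eye on the ldd output can also be scripted. As a minimal sketch, the hypothetical resolved_lib helper below reads ldd output on standard input and prints the path to which a given library resolves. The sample input is copied from the ldd listing in this section.

```shell
#!/bin/sh
# resolved_lib NAME: read ldd output on stdin and print the path that
# library NAME resolves to. Hypothetical helper, not part of MPT.
resolved_lib() {
    awk -v lib="$1" '$1 == lib && $2 == "=>" { print $3; exit }'
}

# Sample ldd output taken from the listing in this section.
resolved_lib libmpi.so <<'EOF'
libmpi.so => /tmp/usr/lib/libmpi.so (0x40014000)
libc.so.6 => /lib/libc.so.6 (0x402ac000)
EOF
# prints: /tmp/usr/lib/libmpi.so
```

Against a real binary, the helper would be used as ldd a.out | resolved_lib libmpi.so, and the result compared with the expected alternate-location directory.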
Running MPI Jobs on a Cluster with MPT Alternate Installation
For MPI jobs to run correctly in a cluster environment in which
MPT has been installed in an alternate location, you must copy all of
the pertinent pieces of MPT to an NFS-mounted filesystem. This is the
only way in which all of the nodes in the cluster can access the software,
short of installing the same MPT RPM on each node. The following method
is one way to accomplish this (assuming /data/nfs is an NFS-mounted directory
and MPT has been installed in the alternate location /tmp/usr).
The -C option makes tar store paths relative to /tmp/usr, so that the bin
and lib directories are unpacked directly under /data/nfs:
node1 # tar cf /tmp/mpt.1.25.tar -C /tmp/usr .
node1 # cp /tmp/mpt.1.25.tar /data/nfs
node1 # cd /data/nfs
node1 # tar xf mpt.1.25.tar
node1 # setenv LD_LIBRARY_PATH /data/nfs/lib
node1 # /data/nfs/bin/mpirun -v -a <arrayname> host_A,host_B -np 1 a.out
Replace <arrayname> in the above example with
an Array Services array name that contains both host_A and host_B.
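The LD_LIBRARY_PATH and mpirun paths in this example depend on the archive members being stored relative to the installation root, so that bin and lib land directly under the NFS directory. The following sketch shows one way to guarantee that layout with the tar -C option. The copy_tree function is a hypothetical name, and the directories used in the demonstration are throwaway stand-ins created with mktemp for /tmp/usr and /data/nfs.

```shell
#!/bin/sh
# copy_tree SRC DST: archive the contents of SRC with paths relative to SRC
# and unpack them directly under DST. Hypothetical helper, not part of MPT.
copy_tree() {
    tar cf - -C "$1" . | tar xf - -C "$2"
}

# Demonstration with temporary stand-ins for /tmp/usr and /data/nfs.
demo_src=$(mktemp -d)
demo_dst=$(mktemp -d)
mkdir -p "$demo_src/bin" "$demo_src/lib"
: > "$demo_src/lib/libmpi.so"            # empty placeholder file
copy_tree "$demo_src" "$demo_dst"
ls "$demo_dst"                           # lists bin and lib
rm -rf "$demo_src" "$demo_dst"
```

On a real system the call would be copy_tree /tmp/usr /data/nfs, which avoids the intermediate tar file altogether.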
This section describes additional system configuration issues that
a system administrator may need to address before running the SGI MPT
software.
Starting Prerequisite Services
MPT requires that procset
and Array Services be started and that the XPMEM kernel module be loaded.
These tasks are performed automatically when the system is rebooted
after the system configuration tasks in this section have been performed.
If a reboot has not been performed, the following commands should be executed
by root:
modprobe xpmem
/etc/init.d/procset restart
/etc/init.d/array restart
If you will be running MPT on a clustered system, these steps (or
a reboot) must be performed for all hosts in the cluster.
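Before launching jobs, it can be useful to confirm that the XPMEM module is actually loaded on a host. The following is a minimal sketch that reads the kernel module list from /proc/modules, where the first field of each line is the module name. The module_loaded function and its optional file argument are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME appears as a module name in
# FILE (default /proc/modules). Hypothetical helper, not part of MPT.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

if module_loaded xpmem; then
    echo "xpmem is loaded"
else
    echo "xpmem is not loaded; run 'modprobe xpmem' as root"
fi
```

On a cluster, the same check would need to be run on every host.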
Configuring Array Services
Array Services must be configured
and running on all hosts in a cluster to launch MPI jobs.
You can set up a simple Array Services configuration by executing the
following two commands as root on all hosts of the cluster. List all host
names on the arrayconfig command line:
/usr/sbin/arrayconfig -m host1 host2 ...
/etc/init.d/array restart
For a more elaborate configuration, consult the arrayconfig(1)
and arrayd.conf(4) man pages and the
"Installing and Configuring Array Services" section of the
Linux Resource Administration Guide.
Adjusting File Descriptor Limits
On large hosts with hundreds
of processors, MPI jobs require a large number of file descriptors. On
these systems you might need to increase the system-wide default for the
per-process limit on the number of open files. The default value for the
file-limit resource is 1024. To change the default value for all users to
8192 file descriptors:
Add the following line to /etc/pam.d/login:
session required /lib/security/pam_limits.so
Add the following lines to /etc/security/limits.conf:
* soft nofile 8192
* hard nofile 8192
The default 1024 file descriptors will allow for approximately 199
MPI processes per host. Increasing the value to 8192 allows for more
than 512 MPI processes per host.
If other login methods are used (ssh,
rlogin, and so on), and the increased file descriptor limits
are desired, the corresponding files in /etc/pam.d
should be modified as well.
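After changing the limits, a new login session can be checked to confirm that the setting took effect. The following is a minimal sketch using the shell's ulimit -n built-in, which reports the soft limit on open files for the current session. The check_nofile function is a hypothetical name, and 8192 is the value used in this section.

```shell
#!/bin/sh
# check_nofile WANT: compare the current soft limit on open files with WANT.
# Hypothetical helper, not part of MPT.
check_nofile() {
    cur=$(ulimit -n)
    if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$1" ]; then
        echo "ok: nofile soft limit is $cur"
    else
        echo "low: nofile soft limit is $cur, wanted $1"
    fi
}

check_nofile 8192
```

Because the limit is applied at login, the check is meaningful only in a session started after the configuration files were edited.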
Adjusting Locked Memory Limits
The OFED-based InfiniBand
software stacks require the resource limit for locked memory to be set
to a high value.
Increase the user hard limit by adding the following line to
/etc/security/limits.conf:
If you are running on a system with an SGI ProPack software release
prior to SGI ProPack 5 Service Pack 1, you will also need to patch the
Array Services startup script /etc/init.d/array to
ensure that arrayd is running with a high "memlock" hard limit.
To do so, execute the following sequence as root:
sed -i.bak 's/ulimit -n/ulimit -l unlimited ; ulimit -n/' \
/etc/init.d/array
/etc/init.d/array restart
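The effect of the sed expression can be previewed without touching the startup script. In the following sketch, the input line is a made-up stand-in, since the actual contents of /etc/init.d/array are not shown in this chapter and may differ.

```shell
#!/bin/sh
# Preview the substitution on a sample line instead of editing the script.
# The input 'ulimit -n 8192' is a hypothetical example line.
patch_expr='s/ulimit -n/ulimit -l unlimited ; ulimit -n/'
preview=$(echo 'ulimit -n 8192' | sed "$patch_expr")
echo "$preview"
# prints: ulimit -l unlimited ; ulimit -n 8192
```

The substitution places a ulimit -l unlimited command in front of the existing ulimit -n command, so the locked-memory limit is raised before arrayd starts. The -i.bak option in the real command keeps the original script as /etc/init.d/array.bak.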