This page gives only the details you need to run a simple multiprocessor
job in the CS environment and, by means of a few examples, to get you
started on the learning curve. For more details on specific tools, see the
relevant man pages. For a thorough introduction to MPI, we recommend the
NCSA Cyberinfrastructure Tutor.
Using Open MPI on Odin
Here is a list of essentials for running Open MPI under SLURM.
- Open MPI is installed as a module (LAM/MPI is still the default) so you must first load the Open MPI module: "module load /san/modules/modulefiles/mpi/openmpi-1.2b3-gcc".
- Compile your MPI application using mpicc (C code) or mpiCC (C++ code).
- Use SLURM to allocate a set of nodes. srun is normally used to run jobs in
batch mode, but while you are getting acquainted with the process,
"salloc -N4 xterm" lets you try things interactively.
- Use "mpirun a.out" to run an MPI application. By default Open MPI will run one copy of the application in each available "slot" provided by SLURM. On Odin, each node has two dual-core processors and thus four slots. Use "mpirun -bynode a.out" to run one copy on each allocated node, or "mpirun -n x a.out" to run x copies distributed over the available slots.
- If you are using SLURM interactively, relinquish your allocated nodes
by exiting the interactive shell.
Example 1: Parallel Hello World!
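Hello_MPI.c contains the bare essentials of an MPI
program: "#include <mpi.h>" and the function calls "MPI_Init(&argc, &argv);" and
"MPI_Finalize();" bracketing all other MPI code. To try it out while logged into
odin, follow the essentials listed above: load the Open MPI module, compile
Hello_MPI.c with mpicc into an executable named Hello, allocate nodes with
SLURM, and use mpirun to run one copy of Hello on each allocated node.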
The output should be a line from each node like this:
Hello World! from Process m of n on odin000.cs.indiana.edu
Odin's nodes have two dual-core CPUs each; to use all available process slots, try
mpirun Hello
Example 2: Distributing Computation
Sample programs sum_a.c and sum_b.c demonstrate two ways of distributing a job over multiple processors and provide examples of the use of MPI_Send and MPI_Recv. The operation is to compute the sum of a range of non-negative integers starting at zero. In sum_a each node independently performs its part of the computation and sends its result to a master node where the partial results are aggregated. In sum_b the master node both assigns subtasks to other nodes and aggregates the results.
Using LAM/MPI on Odin
Here is a list of essentials for running LAM/MPI under SLURM.
- SSH authentication is required in order to run your job on multiple nodes.
In your $HOME directory, run "ssh-keygen -t rsa", and in $HOME/.ssh run
"ln id_rsa.pub authorized_keys".
- Login scripts must not produce any output for non-interactive logins.
- Compile your MPI application using mpicc (C code) or mpiCC (C++ code).
- Use SLURM to allocate a set of nodes. srun is normally used to run jobs in
batch mode, but while you are getting acquainted with the process,
"salloc -N4 xterm" lets you try things interactively.
- A LAM/MPI session starts with running "lamboot bhost" in your SLURM
allocation. bhost is a file in your working directory that lists
odin.cs.indiana.edu plus exactly the nodes allocated by SLURM (check
environment variable SLURM_NODELIST, or use this script
to produce a bhost file).
- Use "mpirun C a.out" to run an MPI application.
- End your LAM/MPI session by running "lamhalt".
- If you are using SLURM interactively, relinquish your allocated nodes
by exiting the interactive shell.
Example 1: Parallel Hello World!
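Hello_MPI.c contains the bare essentials of an MPI
program: "#include <mpi.h>" and the function calls "MPI_Init(&argc, &argv);" and
"MPI_Finalize();" bracketing all other MPI code. To try it out while logged into
odin, follow the essentials listed above: compile Hello_MPI.c with mpicc into an
executable named Hello, allocate nodes with SLURM, run "lamboot bhost", and use
mpirun to start one copy of Hello per node (in LAM/MPI, "mpirun N Hello").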
The output should be a line from each node (including the head node, odin) like this:
Hello World! from Process m of n on odin000.cs.indiana.edu
Odin's nodes have two CPUs each; to use both, try
mpirun C Hello
The bhost file includes the notation "cpu=2" for each node, which is interpreted by mpirun
as "this node can run two processes". The LAM/MPI environment has no real grasp of CPUs and
will happily run as many processes as you ask on any node, leaving the OS to decide how
to schedule them.