"file" command for this.
Head nodes have 8 GB of memory; compute nodes have 4 GB each. Storage is primarily on a SAN with a capacity of 3.5 terabytes. If you want parallel I/O, each process has to write its own separate files to /tmp on the machine it runs on. Otherwise I/O will be limited by the 10GB/sec Infiniband switch.
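As a rough sketch of the "separate files in /tmp on each machine" idea, the helper below builds an output path that is unique per node and per process. The function name scratch_file, the per-user subdirectory, and the out.<host>.<rank> naming scheme are assumptions made up for this example; nothing on odin requires them.

```shell
# Hypothetical helper: build a node-local output path for one process.
# The naming scheme (out.<host>.<rank>) is an assumption for illustration.
scratch_file() {
    rank="$1"                  # process rank, supplied by the caller
    base="${2:-/tmp}"          # node-local directory; /tmp as described above
    dir="$base/$(id -u)"       # per-user subdirectory (numeric uid) to avoid collisions
    mkdir -p "$dir"
    # uname -n prints the node's hostname, so each machine gets its own file name
    printf '%s/out.%s.%s\n' "$dir" "$(uname -n)" "$rank"
}
```

A process with rank 0 could then redirect its output with something like ./a.out > "$(scratch_file 0)". Files written this way stay on the node's local disk, so they have to be collected afterwards if you want them on the SAN.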
srun -N 2 -A
where -N 2 gives you 2 nodes (-n 4 would give 4 processors). This will start
a new shell for you on the head node, and exiting that shell will release
your reservation of nodes automatically. The default allocation time
is four days.
squeue | grep trainXXX
where trainXXX is your assigned login name.
LAM MPI is the default (/usr/bin) version, but for the CCA tutorial you will need to use the one built in /san/cca/mpich_gcc_intel_PIC . [I should set this up in the default path for each trainee account beforehand].
secretword=whatever
Don't use a good password here; this is not cryptographically secure.
chmod 600 .mpd.conf
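The two steps above (a secretword line in .mpd.conf, then chmod 600) can be sketched as one small shell function. The name write_mpd_conf and its arguments are hypothetical, and the sketch assumes the file belongs in your home directory unless you pass another directory.

```shell
# Hypothetical helper: write .mpd.conf with a throwaway secret word,
# readable and writable by the owner only.
write_mpd_conf() {
    dir="${1:-$HOME}"            # assumption: the file lives in your home directory
    secret="${2:-whatever}"      # throwaway word, not a real password
    (
        umask 077                # create the file without group/other access
        printf 'secretword=%s\n' "$secret" > "$dir/.mpd.conf"
    )
    chmod 600 "$dir/.mpd.conf"   # same permission step as shown above
}
```

Calling write_mpd_conf with no arguments writes secretword=whatever to $HOME/.mpd.conf.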
mpdboot -n 2
starts 2 node daemons (which is 4 processors)
mpdtrace -l
shows which ones were grabbed; it's a good idea to check to make sure the
daemons were successfully started.
/san/mpich2/bin/mpdrun -l -np 4 /san/cca/trainXXX/a.out
&
mpdallexit
to kill off the daemons.
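The boot, check, run, and shut-down sequence above could be wrapped so that the daemons are always killed off, even when the MPI job fails. This is only a sketch: run_with_mpd is a made-up name, it relies on the mpd commands being on your PATH (or you can substitute full paths), and it assumes mpdtrace exits with a nonzero status when no daemons are running.

```shell
# Hypothetical wrapper around the mpd commands shown above.
run_with_mpd() {
    nodes="$1"                   # number of node daemons to start
    procs="$2"                   # number of MPI processes to launch
    shift 2                      # the remaining arguments are the program and its args
    mpdboot -n "$nodes" || return 1
    # Assumption: mpdtrace fails if the daemons did not come up.
    if ! mpdtrace -l; then
        mpdallexit
        return 1
    fi
    status=0
    mpdrun -l -np "$procs" "$@" || status=$?
    mpdallexit                   # always shut the daemons down
    return "$status"
}
```

For the example above, the call would be run_with_mpd 2 4 /san/cca/trainXXX/a.out, with trainXXX replaced by your login name.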
CCA requires an MPI built with -fPIC and dynamic libraries. LAM does not support Fortran 90 modules, which occur in the CCA tutorial code. The MPI version to use is in /san/cca/mpich_gcc_intelPIC, and it was compiled with GCC 3.4.3 for C and C++ and Intel version 9.0 for Fortran 90. ROMIO and the MPE libraries were also built in that directory.
odin120
odin121
...
/san/cca/mpich_intel_PIC/bin/mpirun -np 4 -machine_list machines a.out
where a.out is the MPI executable. Specifying the full path will help prevent
the common mistake of compiling with one MPI and using the launch mechanism of
another.
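If the nodes you were given happen to be numbered consecutively, the machines file can be generated rather than typed. The sketch below assumes hostnames of the form odinNNN, as in the listing above; make_machines is a hypothetical helper, and the range you pass must match the nodes in your own reservation.

```shell
# Hypothetical helper: write one hostname per line for a contiguous node range.
make_machines() {
    first="$1"                   # first node number, e.g. 120
    last="$2"                    # last node number, inclusive
    out="${3:-machines}"         # output file; "machines" matches the mpirun line above
    : > "$out"                   # start with an empty file
    i="$first"
    while [ "$i" -le "$last" ]; do
        printf 'odin%s\n' "$i" >> "$out"
        i=$((i + 1))
    done
}
```

For example, make_machines 120 121 writes odin120 and odin121 to a file named machines in the current directory.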
The overall OS is Gentoo Linux, which may be partly responsible for ...
Libraries are sometimes in unexpected places on odin. We have
GCC 4.x has not yet been successfully built on odin, possibly because of some nonstandard placement of the libraries. If possible, Bramley will try to build it and create an mpich_gfortran that uses all GCC compilers.
To release the queue:
a) The reservation is released automatically when you exit the shell, if you are running in interactive mode
or
b) use scancel to cancel the reservation.
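For option b), one possible way to find and cancel your own reservations is sketched below. It assumes the installed squeue accepts the -h (no header), -u (user), and -o (output format) options and that scancel takes a job ID, as in standard SLURM; cancel_my_jobs is a made-up name.

```shell
# Hypothetical helper: cancel every job that squeue lists for one user.
cancel_my_jobs() {
    user="${1:-$(id -un)}"       # defaults to the current login name
    # -h drops the header line, -o '%i' prints only the job ID
    for job in $(squeue -h -u "$user" -o '%i'); do
        echo "cancelling job $job"
        scancel "$job"
    done
}
```

Run cancel_my_jobs trainXXX with your assigned login name, then check with squeue that nothing of yours is left in the queue.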
We intend to keep your accounts and files intact for at least two weeks after the end of the tutorial, to give you time to transfer them over to your home machine for future delectation. After that, I will probably delete all .o, .so, and .a files and then tar the whole thing over to our HPSS tape system, just in case someone needs the files later. Be sure not to leave anything private lying around in the directories, like your credit card numbers or the bank account that you gave the nice Nigerian gentleman a few weeks ago.