
MPI

In order to facilitate code development, TCM supports a number of different MPI systems. However, as there is no standard concerning the detailed implementation of MPI, any attempt to mix and match causes failure. The include files at compile time, the libraries at link time, and the mpirun command used at run time must all match.

In order to cut down the number of combinations, we do not support the use of MPI between different machines. MPI in TCM is intended for use on multicore computers, or dedicated clusters, only.

This page describes MPI as found on TCM's desktops and s series machines. There is a separate page of brief benchmarks.

Compiler Support and MPI versions

Currently TCM has versions of both OpenMPI and MPICH. It is important to use the same version at compile, link and run time. In general versions differing only in the last digit of the version number can be mixed, but mixing OpenMPI 1.6.x and 1.8.y, for instance, does not seem to be possible.

Because both OpenMPI and MPICH like to provide wrappers for compilers, one needs to use the correct wrapper for the compiler one wishes to use. The wrappers are named as follows:

Fortran: mpifort, mpifx, mpgfortran, mpflang-amd, mpnvfortran and mpnagfor
C: mpicc, mpicx, mpgcc, mpclang and mpnvc
C++: mpicpc, mpicpx, mpg++, mpclang and mpnvc++

It is not necessary to specify -lmpi on the command line - all required MPI libraries should be linked automatically. Shared MPI libraries will not be used by default. Not all compilers are installed for all MPI versions.
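
For example, to build the same hypothetical test.f90 or test.c one simply picks the wrapper for the desired compiler (assuming the obvious mapping of wrapper name to underlying compiler):

  mpifort test.f90       # Intel ifort
  mpifx test.f90         # Intel ifx
  mpgfortran test.f90    # Gnu gfortran
  mpnagfor test.f90      # NAG nagfor
  mpicc test.c           # Intel icc
  mpgcc test.c           # Gnu gcc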

Setting TCM_INTEL_VER to set the Intel compiler version works as usual.
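
For example (the version string below is purely illustrative; use one that is actually installed):

  export TCM_INTEL_VER=2021.1.1   # illustrative value only
  mpifort test.f90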

Availability

This changes almost continuously. As of March 2024 we have (amongst others):

OpenMPI-5.0.2: Intel and Gnu compilers
OpenMPI-4.1.4: Intel, Gnu, NAG, AMD and Nvidia compilers
OpenMPI-4.0.5: Intel, Gnu and NAG compilers
OpenMPI-3.1.6: Intel, Gnu and NAG compilers

Intel-2021.1.1: Intel and Gnu compilers

MPICH-4.2.0: Intel and Gnu compilers
MPICH-4.1.3: Intel, Gnu and NAG compilers
MPICH-4.0.3: Intel, Gnu and NAG compilers
MPICH-3.2.1: Intel and Gnu compilers

Note that for the NAG compiler the mpi_f08 module is available only with MPICH 4.0 and 4.1, not with MPICH 4.2.

The current situation can be determined by looking in the directory /usr/local/shared/MPI.
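
For example, simply listing that directory shows what is installed; the subdirectory names should match the values that the MPI environment variable can take:

  ls /usr/local/shared/MPI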

To switch between these, set the environment variable MPI to one of the above, e.g.

export MPI=OpenMPI-3.1.6
mpifort test.f90

MPICH-3.2 with the Gnu compilers seems to require specifying one extra library: -lpthread.
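
So with the Gnu Fortran wrapper and a hypothetical test.f90, building against that version would look something like:

  export MPI=MPICH-3.2.1
  mpgfortran test.f90 -lpthread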

Running the Result

If a non-default version of MPI was used, set the MPI environment variable appropriately so that the correct mpirun or mpiexec command is used. If MPI is not set, mpirun and mpiexec attempt to choose the correct version automatically. Note that MPICH uses mpiexec exclusively, whereas OpenMPI supports both.

Documentation

There exist man pages, viewable with the mpiman command, which will display those pages relevant to the current setting of the MPI environment variable.
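
For example, to view the man page for MPI_Send from whichever MPI is currently selected (assuming mpiman takes the page name as an argument, as man itself does):

  mpiman MPI_Send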

Tutorials Etc.

I would recommend this course, but that is hardly surprising...

CMA

Does using Cross Memory Attach make your code go faster? In theory, yes. In practice, sometimes not. OpenMPI defaults to using it from version 2.1.0 onwards. To control it at run time, use one of the following (the first disables the single-copy mechanism, the second selects CMA):

  mpirun -mca btl_vader_single_copy_mechanism none ./a.out
  mpirun -mca btl_vader_single_copy_mechanism cma ./a.out

Example

The code used below is:

program version
  use mpi
  implicit none
  integer major,minor,nproc,rank,ierr

  ! Initialise MPI, then find out how many processes there are and which one this is
  call mpi_init(ierr)
  call mpi_comm_size(mpi_comm_world, nproc, ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  write(*,101) rank,nproc
101   format(' Hello F90 parallel world, I am process ', &
        I3,' out of ',I3)

  ! Just one process reports the version of the MPI standard supported by the
  ! library found at run time
  if (rank.eq.0) then
    call mpi_get_version(major,minor,ierr)
    write(*,102) major,minor
102   format(' Run time MPI version is ',I1,'.',I1)
  endif

  call mpi_finalize(ierr)
end program version

And the result of trying to compile and run it is:

m1:/tmp$ mpifort version.f90 
m1:/tmp$ mpirun -np 2 ./a.out 
 Hello F90 parallel world, I am process   0 out of   2
 Hello F90 parallel world, I am process   1 out of   2
 Run time MPI version is 2.1
m1:/tmp$ MPI=OpenMPI-1.10.1 mpifort version.f90 
m1:/tmp$ mpirun -np 2 ./a.out 
[m1:14717] [[3050,0],0] mca_oob_tcp_recv_handler: invalid message type: 15
[m1:14717] [[3050,0],0] mca_oob_tcp_recv_handler: invalid message type: 15
m1:/tmp$ MPI=OpenMPI-1.10.1 mpirun -np 2 ./a.out 
 Hello F90 parallel world, I am process   0 out of   2
 Run time MPI version is 3.0
 Hello F90 parallel world, I am process   1 out of   2
m1:/tmp$ export MPI=MPICH-3.2.0
m1:/tmp$ mpifort version.f90 
m1:/tmp$ mpiexec -n 3 ./a.out 
 Hello F90 parallel world, I am process   1 out of   3
 Hello F90 parallel world, I am process   0 out of   3
 Run time MPI version is 3.1
 Hello F90 parallel world, I am process   2 out of   3