Hi all,

In our cluster the nodes are interconnected with RoCE, and I want to set up Open MPI to run over it via SLURM. I initially compiled Open MPI 1.10.2 with only IB verbs support, and with that build I have no problem running over RoCE.
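Just for reference, a direct launch along these lines runs fine over RoCE with that verbs-only build (a rough sketch from memory; the exact command may have differed slightly, but these are essentially the MCA settings I use):

    mpirun -np 2 --host test-vmp1244,test-vmp1245 \
        --mca btl openib,self,sm \
        --mca btl_openib_cpc_include rdmacm \
        ./osu_latency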
I then successfully rebuilt it with SLURM support as follows:

./configure --with-slurm --with-pmi=/usr/scheduler/slurm --with-verbs --with-hwloc

The problem is that I cannot get it to use the RoCE network when I launch with srun. I also tried exporting the Open MPI runtime options as environment variables, but I still cannot initialize the network correctly:

$ echo $OMPI_MCA_btl
openib,self,sm
$ echo $OMPI_MCA_btl_openib_cpc_include
rdmacm
$ srun -n 2 --mpi=pmi2 ./osu_latency
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:       test-vmp1245
  Local device:     mlx4_0
  Local port:       2
  CPCs attempted:   udcm
--------------------------------------------------------------------------
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:       test-vmp1244
  Local device:     mlx4_0
  Local port:       2
  CPCs attempted:   udcm
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[27,4],0]) is on host: test-vmp1244
  Process 2 ([[27,4],1]) is on host: test-vmp1245
  BTLs attempted: self

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
MPI_INIT has failed because at least one MPI process is unreachable
from another.  This *usually* means that an underlying communication
plugin -- such as a BTL or an MTL -- has either not loaded or not
allowed itself to be used.  Your MPI job will now abort.

You may wish to try to narrow down the problem;

  * Check the output of ompi_info to see which BTL/MTL plugins are
    available.
  * Run your application with MPI_THREAD_SINGLE.
  * Set the MCA parameter btl_base_verbose to 100 (or mtl_base_verbose,
    if using MTL-based communications) to see exactly which
    communication plugins were considered and/or discarded.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[test-vmp1245:3603] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: test-vmp1244: task 0: Exited with exit code 1
srun: error: test-vmp1245: task 1: Exited with exit code 1

Any suggestions?

Thanks!
Davide
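P.S. Following the hint at the bottom of that output, I can also rerun with the BTL selection logging turned up (same environment-variable style as above) and post the result if that would help, e.g.:

    $ export OMPI_MCA_btl_base_verbose=100
    $ srun -n 2 --mpi=pmi2 ./osu_latency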