Re: [OMPI users] Open MPI in an InfiniBand dual-rail configuration issues

2017-07-19 Thread Gilles Gouaillardet
Ludovic, what happens here is that, by default, an MPI task will only use the closest IB device. Since tasks are bound to a socket, that means tasks on socket 0 will only use mlx4_0, and tasks on socket 1 will only use mlx4_1. Because these are on independent subnets, that also means that tasks …
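
(Not part of the quoted reply; just a hedged illustration of the knobs involved. With the openib BTL, mpirun's --report-bindings shows which socket each rank lands on, and the btl_openib_if_include MCA parameter restricts which HCAs a rank may use; btl_openib_ignore_locality, if present in this Open MPI release, disables the closest-device preference described above. The rank count and executable name below are placeholders, and these flags are a sketch, not necessarily the resolution reached in this thread.)

    # Confirm the socket binding of each rank (which determines its "closest" HCA)
    mpirun --map-by socket --bind-to socket --report-bindings -np 24 ./my_app

    # Sketch: let ranks consider both rails instead of only the local one
    mpirun --mca btl openib,vader,self \
           --mca btl_openib_if_include mlx4_0,mlx4_1 \
           --mca btl_openib_ignore_locality 1 \
           -np 24 ./my_app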

[OMPI users] Open MPI in an InfiniBand dual-rail configuration issues

2017-07-19 Thread Ludovic Raess
Hi,

We have an issue on our 32-node Linux cluster regarding the usage of Open MPI in an InfiniBand dual-rail configuration.

Node config:
- Supermicro dual-socket Xeon E5 v3 6-core CPUs
- 4 Titan X GPUs
- 2 IB ConnectX FDR single-port HCAs (mlx4_0 and mlx4_1)
- CentOS 6.6, OFED 3.1, Open MPI 2.0.…
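
(Not part of the original post; a quick sanity check with standard OFED tools, assuming they are installed on the nodes, to confirm both HCAs are active and to read the subnet prefix of each rail from GID[0].)

    # List both HCAs and their port states
    ibstat

    # Verbose port info per rail; GID[0] begins with the subnet prefix,
    # so differing prefixes confirm the two rails sit on independent subnets
    ibv_devinfo -v -d mlx4_0 | grep -E 'state|GID'
    ibv_devinfo -v -d mlx4_1 | grep -E 'state|GID'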