Re: [OMPI users] Open MPI internal error

2017-09-28 Thread Ludovic Raess
On 28 September 2017 at 01:26, Ludovic Raess <ludovic.ra...@unil.ch> wrote: Hi, we have an issue on our 32-node Linux cluster regarding the usage of Open MPI in an Infiniband dual-ra

[OMPI users] Open MPI internal error

2017-09-27 Thread Ludovic Raess
Hi, we have an issue on our 32-node Linux cluster regarding the usage of Open MPI in an Infiniband dual-rail configuration (2 IB ConnectX FDR single-port HCAs, CentOS 6.6, OFED 3.1, Open MPI 2.0.0, gcc 5.4, CUDA 7). On long runs (over ~10 days) involving more than 1 node (usually 64 MPI proces

[OMPI users] MPI vendor error

2017-09-15 Thread Ludovic Raess
Hi, we have an issue on our 32-node Linux cluster regarding the usage of Open MPI in an Infiniband dual-rail configuration (2 IB ConnectX FDR single-port HCAs, CentOS 6.6, OFED 3.1, Open MPI 2.0.0, gcc 5.4, CUDA 7). On long runs (over ~10 days) involving more than 1 node (usually 64 MPI proces

[OMPI users] Open MPI in an Infiniband dual-rail configuration issues

2017-07-19 Thread Ludovic Raess
Hi, We have an issue on our 32-node Linux cluster regarding the usage of Open MPI in an Infiniband dual-rail configuration. Node config: - Supermicro dual-socket Xeon E5 v3 6-core CPUs - 4 Titan X GPUs - 2 IB ConnectX FDR single-port HCAs (mlx4_0 and mlx4_1) - CentOS 6.6, OFED 3.1, openmpi 2.0.
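
Since all of the threads above concern the same dual-rail setup, here is a minimal sketch of how such a configuration is typically pinned to both HCAs with the openib BTL of the Open MPI 2.0.x generation. The process count, hostfile name, and application binary are placeholders, not details taken from the threads:

    # Hypothetical launch line: restrict the openib BTL to the two FDR rails so
    # traffic can be striped across mlx4_0 and mlx4_1; vader/self handle
    # on-node communication.
    mpirun -np 64 --hostfile hosts.txt \
        --mca btl openib,vader,self \
        --mca btl_openib_if_include mlx4_0,mlx4_1 \
        ./app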