Re: [OMPI users] silent failure for large allgather

2019-09-13 Thread Emmanuel Thomé via users
Hi,

Thanks Jeff for your reply, and sorry for this late follow-up...

On Sun, Aug 11, 2019 at 02:27:53PM -0700, Jeff Hammond wrote:
> > openmpi-4.0.1 gives essentially the same results (similar files
> > attached), but with various doubts on my part as to whether I've run this
> > check correctly.

[OMPI users] silent failure for large allgather

2019-08-06 Thread Emmanuel Thomé via users
Hi, In the attached program, the MPI_Allgather() call fails to communicate all data (the amount it communicates wraps around at 4G...). I'm running on an Omni-Path cluster (2018 hardware), with Open MPI 3.1.3 or 4.0.1 (tested both). With the OFI MTL, the failure is silent, with no error message reporte
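
To illustrate what "wraps around at 4G" can mean arithmetically, here is a minimal sketch in C. It is not taken from the attached program and is not a diagnosis of Open MPI or the OFI MTL: it only assumes, hypothetically, that a 64-bit byte count gets truncated to 32 bits somewhere along the way. The function names `allgather_bytes` and `truncated_to_32_bits` are made up for this illustration.

```c
#include <stdint.h>

/* Total number of bytes each rank should end up with after an allgather in
 * which each of `nranks` ranks contributes `count` elements of `elem_size`
 * bytes. Computed in 64 bits so the true size is not lost. */
static uint64_t allgather_bytes(uint64_t count, uint64_t elem_size,
                                uint64_t nranks)
{
    return count * elem_size * nranks;
}

/* Hypothetical failure mode (an assumption, not a confirmed cause): the byte
 * count is stored in a 32-bit unsigned integer, which reduces it modulo 2^32
 * (4 GiB). */
static uint32_t truncated_to_32_bits(uint64_t nbytes)
{
    return (uint32_t)nbytes;
}
```

For example, with 3 ranks each contributing 2^28 elements of 8 bytes, `allgather_bytes` gives 6 GiB, while `truncated_to_32_bits` of that value gives only 2 GiB, i.e. the count has wrapped around once at 4 GiB.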

[OMPI users] pml ^ucx + mtl ofi (nonsensical ?) --> segfault at large sizes

2019-07-19 Thread Emmanuel Thomé via users
Hi, I came across this. openmpi-4.0.1 compiled with:
    ../openmpi-4.0.1/configure --disable-mpi-fortran --without-cuda --disable-opencl --with-ucx=/path/to/ucx-1.5.1
The execution of the attached program (a simple MPI_Send / MPI_Recv pair) gives a segfault when the message size exceeds 2^30. I'm see