Thanks, it's at least good to know that the behaviour isn't normal!
Could it be some sort of memory leak in the call? The code in
ompi/runtime/ompi_mpi_preconnect.c
looks reasonably safe, though maybe doing thousands of isend/irecv
pairs is causing problems with the buffer used in ptp mes
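For reference, the all-pairs preconnect idea can be sketched roughly as below. This is an illustrative simplification of the general pattern, not the actual code in ompi/runtime/ompi_mpi_preconnect.c: each rank posts a small nonblocking send/recv pair to every peer so connections get established eagerly, and MPI_Waitall bounds how many requests stay outstanding. (Running it requires an MPI installation and mpirun.)

```c
/* Sketch of an all-pairs "preconnect" pattern: each rank exchanges a
 * one-byte nonblocking message with every other rank.  Illustrative
 * simplification only, not the real Open MPI preconnect code. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    char sbuf = 0;
    char *rbuf = calloc((size_t)size, 1);               /* one recv slot per peer */
    MPI_Request *reqs = malloc(2 * (size_t)size * sizeof(MPI_Request));
    int n = 0;

    for (int peer = 0; peer < size; ++peer) {
        if (peer == rank) continue;
        MPI_Irecv(&rbuf[peer], 1, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[n++]);
        MPI_Isend(&sbuf,       1, MPI_CHAR, peer, 0, MPI_COMM_WORLD, &reqs[n++]);
    }
    /* Completing the requests here is what keeps the pattern safe;
     * issuing thousands of pairs without waiting on them is where
     * buffering pressure in the point-to-point layer could build up. */
    MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);

    free(reqs);
    free(rbuf);
    MPI_Finalize();
    return 0;
}
```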
Hi Sir,
We are using the CUDA 6.0 release and the 331.89 driver…
A little background: the master does not init CUDA. We have tried this method
too, having all five processes init CUDA, but it seems to trigger the problem
more easily.
Yes, the example below was on one machine, but we have seen it even
Hi:
I just tried running a program similar to yours with CUDA 6.5 and Open MPI, and
I could not reproduce the issue. Just to make sure I am doing things correctly:
is your example below running with np=5 on a single node? Which version of CUDA
are you using? Can you also send the output from nvidia-s
Thanks for your quick response,
1) mpiexec --allow-run-as-root --mca btl_openib_want_cuda_gdr 1 --mca
btl_openib_cuda_rdma_limit 6 --mca mpi_common_cuda_event_max 1000 -n 5
test/RunTests
2) Yes, CUDA-aware support using Mellanox IB.
3) Yes, we have the ability to use several versions of Open MPI.
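Since several Open MPI versions are in play, it may be worth double-checking that the build actually being used was compiled with CUDA support. A quick check, assuming a standard Open MPI installation with ompi_info on the PATH:

```shell
# Prints a line ending in ":true" if this Open MPI build
# was compiled with CUDA-aware support.
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
```

If this reports false, the btl_openib CUDA parameters above would have no effect.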