Hi Rob:

Sorry for the slow reply, but it took me a while to figure this out. It turns out that this issue had to do with how some of the memory within the smcuda BTL was being registered with CUDA. This was fixed a few weeks ago, and the fix will be available in the 1.8.5 release. Perhaps you could retry with a pre-release version of Open MPI 1.8.5 from the nightly page below and confirm that it fixes your issue. Any of the builds listed on that page should be fine.
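Once you have one of the nightlies installed, a quick sanity check that the build really was compiled with CUDA support is the usual ompi_info query, for example:

ompi_info --parsable --all | grep mpi_built_with_cuda_support:value

which should report a value of true. The nightly builds are here: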
http://www.open-mpi.org/nightly/v1.8/

Thanks,
Rolf

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart
Sent: Wednesday, February 11, 2015 3:50 PM
To: Open MPI Users
Subject: Re: [OMPI users] GPUDirect with OpenMPI

Let me try to reproduce this. This should not have anything to do with GPUDirect RDMA. However, to eliminate it, you could run with: --mca btl_openib_want_cuda_gdr 0.

Rolf

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Aulwes, Rob
Sent: Wednesday, February 11, 2015 2:17 PM
To: us...@open-mpi.org
Subject: [OMPI users] GPUDirect with OpenMPI

Hi,

I built Open MPI 1.8.3 using PGI 14.7 with CUDA support enabled for CUDA 6.0. I have a Fortran test code that exercises GPUDirect, included below. When I run it across 2 nodes with 4 MPI processes, it sometimes fails with incorrect results. Specifically, rank 1 sometimes does not receive the correct value from one of its neighbors.

The code was compiled with PGI 14.7:

mpif90 -o direct.x -acc acc_direct.f90

and executed with:

mpirun -np 4 -npernode 2 -mca btl_openib_want_cuda_gdr 1 ./direct.x

Does anyone know if I'm missing something when using GPUDirect?

Thanks,
Rob Aulwes

program acc_direct

   include 'mpif.h'

   integer :: ierr, rank, nranks
   integer, dimension(:), allocatable :: i_ra

   call mpi_init(ierr)

   call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
   rank = rank + 1
   write(*,*) 'hello from rank ', rank

   call MPI_COMM_SIZE(MPI_COMM_WORLD, nranks, ierr)

   allocate( i_ra(nranks) )

   call nb_exchange

   call mpi_finalize(ierr)

contains

   subroutine nb_exchange

      integer :: i, j, cnt
      integer, dimension(nranks - 1) :: sendreq, recvreq
      logical :: done
      integer :: stat(MPI_STATUS_SIZE)

      i_ra = -1
      i_ra(rank) = rank

      !$acc data copy(i_ra(1:nranks))
      !$acc host_data use_device(i_ra)

      cnt = 0
      do i = 1, nranks
         if ( i .ne. rank ) then
            cnt = cnt + 1
            call MPI_ISEND(i_ra(rank), 1, MPI_INTEGER, i - 1, rank, &
                           MPI_COMM_WORLD, sendreq(cnt), ierr)
            if ( ierr .ne. MPI_SUCCESS ) write(*,*) 'isend call failed.'

            call MPI_IRECV(i_ra(i), 1, MPI_INTEGER, i - 1, i, &
                           MPI_COMM_WORLD, recvreq(cnt), ierr)
            if ( ierr .ne. MPI_SUCCESS ) write(*,*) 'irecv call failed.'
         endif
      enddo

      !$acc end host_data

      i = 0
      do while ( i .lt. 2*cnt )
         do j = 1, cnt
            if ( recvreq(j) .ne. MPI_REQUEST_NULL ) then
               call MPI_TEST(recvreq(j), done, stat, ierr)
               if ( ierr .ne. MPI_SUCCESS ) &
                  write(*,*) 'test for irecv call failed.'
               if ( done ) then
                  i = i + 1
               endif
            endif

            if ( sendreq(j) .ne. MPI_REQUEST_NULL ) then
               call MPI_TEST(sendreq(j), done, MPI_STATUS_IGNORE, ierr)
               if ( ierr .ne. MPI_SUCCESS ) &
                  write(*,*) 'test for isend call failed.'
               if ( done ) then
                  i = i + 1
               endif
            endif
         enddo
      enddo

      write(*,*) rank, ': nb_exchange: Updating host...'
      !$acc update host(i_ra(1:nranks))

      do j = 1, nranks
         if ( i_ra(j) .ne. j ) then
            write(*,*) 'isend/irecv failed.'
            write(*,*) 'rank', rank, ': i_ra(', j, ') = ', i_ra(j)
         endif
      enddo

      !$acc end data

   end subroutine

end program

________________________________
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
________________________________
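For reference, Rolf's earlier suggestion to rule out GPUDirect RDMA amounts to re-running the same job with that parameter switched off, e.g. with the same binary and process layout as in the original command:

mpirun -np 4 -npernode 2 -mca btl_openib_want_cuda_gdr 0 ./direct.x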