Re: [OMPI users] Stream interactions in CUDA

2012-12-13 Thread Shamis, Pavel
vel. Thanks, Justin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of Rolf vandeVaart [rvandeva...@nvidia.com] Sent: Thursday, December 13, 2012 6:18 AM To: Open MPI Users Subject: Re: [OMPI users] S

Re: [OMPI users] Stream interactions in CUDA

2012-12-13 Thread Justin Luitjens
>From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] >On Behalf Of Jens Glaser >Sent: Wednesday, December 12, 2012 8:12 PM >To: Open MPI Users >Subject: Re: [OMPI users] Stream interactions in CUDA > >Hi Justin > >from looking at your code it seems you are

Re: [OMPI users] Stream interactions in CUDA

2012-12-13 Thread Rolf vandeVaart
.@open-mpi.org] >On Behalf Of Jens Glaser >Sent: Wednesday, December 12, 2012 8:12 PM >To: Open MPI Users >Subject: Re: [OMPI users] Stream interactions in CUDA > >Hi Justin > >from looking at your code it seems you are receiving more bytes from the >processors than you se

Re: [OMPI users] Stream interactions in CUDA

2012-12-12 Thread Jens Glaser
Hi Justin, from looking at your code it seems you are receiving more bytes from the processors than you send (I assume MAX_RECV_SIZE_PER_PE > send_sizes[p]). I don't think this is valid. Your transfers should have matched sizes on the sending and receiving side. To achieve this, either communicat
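
[Editor's note: a minimal sketch of the matched-size approach Jens describes, assuming a C/MPI setup similar to the quoted code. The function name, the per-peer buffer layout, the MPI_BYTE datatype, and the names send_sizes/recv_sizes are illustrative assumptions, not Justin's actual code.]

/* Exchange per-peer byte counts first, then post receives whose sizes
 * exactly match what each peer will send (instead of receiving a fixed
 * MAX_RECV_SIZE_PER_PE). */
#include <mpi.h>
#include <stdlib.h>

void exchange(char **send_bufs, int *send_sizes,
              char **recv_bufs, int nprocs, MPI_Comm comm)
{
    int *recv_sizes = malloc(nprocs * sizeof(int));
    MPI_Request *reqs = malloc(2 * nprocs * sizeof(MPI_Request));

    /* Step 1: every rank learns how many bytes each peer will send it. */
    MPI_Alltoall(send_sizes, 1, MPI_INT, recv_sizes, 1, MPI_INT, comm);

    /* Step 2: post receives and sends with matched sizes. */
    for (int p = 0; p < nprocs; p++)
        MPI_Irecv(recv_bufs[p], recv_sizes[p], MPI_BYTE, p, 0, comm, &reqs[p]);
    for (int p = 0; p < nprocs; p++)
        MPI_Isend(send_bufs[p], send_sizes[p], MPI_BYTE, p, 0, comm,
                  &reqs[nprocs + p]);

    MPI_Waitall(2 * nprocs, reqs, MPI_STATUSES_IGNORE);
    free(recv_sizes);
    free(reqs);
}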

Re: [OMPI users] Stream interactions in CUDA

2012-12-12 Thread Dmitry N. Mikushin
Hi Justin, quick grepping reveals several cuMemcpy calls in OpenMPI. Some of them are even synchronous, meaning stream 0. I think the best way of exploring this sort of behavior is to run the OpenMPI runtime (thanks to its open-source nature!) under a debugger. Rebuild OpenMPI with -g -O0, add some
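
[Editor's note: a small CUDA C illustration, not Open MPI source, of what "synchronous, meaning stream 0" implies for Justin's problem. With legacy default-stream semantics, a plain cudaMemcpy runs in the default stream and synchronizes with work queued in the application's own (blocking) streams, so a synchronous copy issued inside the MPI library can serialize against user streams. The kernel and buffer names below are made up for illustration.]

#include <cuda_runtime.h>
#include <stdio.h>

__global__ void busy_kernel(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    for (int k = 0; k < 10000; k++)       /* keep the GPU busy for a while */
        x[i] = x[i] * 1.000001f + 0.000001f;
}

int main(void)
{
    const int n = 1 << 20;
    float *d_x, *h_buf;
    cudaStream_t user_stream;

    cudaMalloc(&d_x, n * sizeof(float));
    cudaMallocHost(&h_buf, n * sizeof(float));
    cudaStreamCreate(&user_stream);        /* the application's own stream */

    busy_kernel<<<n / 256, 256, 0, user_stream>>>(d_x, n);

    /* Synchronous copy in the (legacy) default stream, comparable to a
     * library-internal cuMemcpy: it does not start until busy_kernel in
     * user_stream has finished, even though the two use different streams,
     * and later launches wait for the copy to complete. */
    cudaMemcpy(h_buf, d_x, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("copy done after kernel: %s\n",
           cudaGetErrorString(cudaGetLastError()));

    cudaStreamDestroy(user_stream);
    cudaFree(d_x);
    cudaFreeHost(h_buf);
    return 0;
}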