vel.
Thanks,
Justin
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
On Behalf Of Rolf vandeVaart [rvandeva...@nvidia.com]
Sent: Thursday, December 13, 2012 6:18 AM
To: Open MPI Users
Subject: Re: [OMPI users] Stream interactions in CUDA
>From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
>On Behalf Of Jens Glaser
>Sent: Wednesday, December 12, 2012 8:12 PM
>To: Open MPI Users
>Subject: Re: [OMPI users] Stream interactions in CUDA
>
>Hi Justin
>
>from looking at your code it seems you are receiving more bytes from the
>processors than you send (I assume MAX_RECV_SIZE_PER_PE > send_sizes[p]).
>I don't think this is valid. Your transfers should have matched sizes on the
>sending and receiving side. To achieve this, either communicate the sizes of
>the messages to the receiving side before the actual transfer, or size the
>receives from the incoming messages themselves (e.g. with MPI_Probe and
>MPI_Get_count).
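
[Editor's note: as an illustration of the matched-size approach described in the
quoted message above, here is a minimal sketch that sizes the receive from the
incoming message with MPI_Probe/MPI_Get_count. The function and buffer names are
illustrative and not taken from Justin's code.]

/* Sketch: size the receive from the incoming message instead of posting a
 * fixed-size buffer of MAX_RECV_SIZE_PER_PE bytes. */
#include <mpi.h>
#include <stdlib.h>

static void recv_matched(int src, int tag, MPI_Comm comm)
{
    MPI_Status status;
    int nbytes = 0;

    /* Wait until a message from 'src' is available and query its true size. */
    MPI_Probe(src, tag, comm, &status);
    MPI_Get_count(&status, MPI_BYTE, &nbytes);

    char *buf = malloc(nbytes);

    /* The posted count now matches what the sender actually sent. */
    MPI_Recv(buf, nbytes, MPI_BYTE, src, tag, comm, MPI_STATUS_IGNORE);

    /* ... unpack/use buf ... */
    free(buf);
}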
Hi Justin,
Quick grepping reveals several cuMemcpy calls in OpenMPI. Some of them are
even synchronous, meaning they go through stream 0 (the default stream).
I think the best way of exploring this sort of behavior is to execute the
OpenMPI runtime (thanks to its open-source nature!) under a debugger. Rebuild
OpenMPI with -g -O0, add some breakpoints around the cuMemcpy calls.
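
[Editor's note: for reference, a minimal standalone sketch (not OpenMPI code) of
the point about synchronous copies and stream 0: a plain cudaMemcpy is issued on
the default stream and blocks the host, so it serializes with work the
application has queued on its own streams, whereas cudaMemcpyAsync on a
user-created stream can overlap. All names below are illustrative.]

#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    const size_t nbytes = (1 << 20) * sizeof(float);
    float *h_buf, *d_buf;
    cudaStream_t stream;

    cudaMallocHost((void **)&h_buf, nbytes);   /* pinned host memory */
    cudaMalloc((void **)&d_buf, nbytes);
    cudaStreamCreate(&stream);

    /* Synchronous copy: runs on the default stream (stream 0) and blocks the
     * host until it completes; with the legacy default stream it also
     * synchronizes with the other streams. This is how a synchronous
     * cuMemcpy issued inside a library call behaves. */
    cudaMemcpy(d_buf, h_buf, nbytes, cudaMemcpyHostToDevice);

    /* Asynchronous copy: queued on a user stream; the host returns
     * immediately and the copy can overlap with work on other streams. */
    cudaMemcpyAsync(d_buf, h_buf, nbytes, cudaMemcpyHostToDevice, stream);
    cudaStreamSynchronize(stream);

    printf("copies issued\n");
    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}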