In the case of reductions, yes, we copy the data into host memory so we can
do the reduction.  For other collectives and for point-to-point communication,
GPU Direct RDMA is used (for smaller messages).
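
If it helps, here is a minimal sketch (illustrative only, not taken from the
Open MPI examples) of what the CUDA-aware path looks like from the application
side: the device pointers are passed straight to MPI_Allreduce(), and the
staging through host memory described above happens inside the library.

/* Minimal sketch: summing a device buffer with MPI_Allreduce under a
 * CUDA-aware Open MPI build.  Buffer names and the element count are
 * illustrative. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    const int n = 1 << 20;                  /* example element count */
    float *d_sendbuf, *d_recvbuf;
    cudaMalloc((void **)&d_sendbuf, n * sizeof(float));
    cudaMalloc((void **)&d_recvbuf, n * sizeof(float));
    /* ... fill d_sendbuf on the device ... */

    /* A CUDA-aware build accepts device pointers directly; for the
     * reduction itself the library stages the data through host memory. */
    MPI_Allreduce(d_sendbuf, d_recvbuf, n, MPI_FLOAT, MPI_SUM,
                  MPI_COMM_WORLD);

    cudaFree(d_sendbuf);
    cudaFree(d_recvbuf);
    MPI_Finalize();
    return 0;
}

Running something like this under an --enable-debug build with
--mca coll_base_verbose 100 on the mpirun command line will show which
collective component actually handles the call.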

Rolf

>-----Original Message-----
>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Yang Zhang
>Sent: Friday, September 25, 2015 11:37 AM
>To: Open MPI Users
>Subject: Re: [OMPI users] How does MPI_Allreduce work?
>
>Hi Rolf,
>
>Thanks very much for the info! So with a CUDA-aware build, Open MPI still has
>to copy all the data into host memory first, and then do send/recv on the host
>memory? I thought Open MPI would use GPUDirect and RDMA to send/recv
>GPU memory directly.
>
>I will try a debug build and see what it says. Thanks!
>
>Best,
>Yang
>
>------------------------------------------------------------------------
>
>Sent by Apple Mail
>
>Yang ZHANG
>
>PhD candidate
>
>Networking and Wide-Area Systems Group
>Computer Science Department
>New York University
>
>715 Broadway Room 705
>New York, NY 10003
>
>> On Sep 25, 2015, at 11:07 AM, Rolf vandeVaart <rvandeva...@nvidia.com>
>wrote:
>>
>> Hello Yang:
>> It is not clear to me if you are asking about a CUDA-aware build of Open MPI
>> where you do the MPI_Allreduce() on the GPU buffer, or if you are staging the
>> GPU data into host memory yourself and then calling MPI_Allreduce().  Either
>> way, the two cases are somewhat similar.  With a CUDA-aware build, an
>> MPI_Allreduce() on GPU data simply copies the data into a host buffer first
>> and then calls the underlying implementation.
>>
>> Depending on how you have configured your Open MPI, the underlying
>> implementation may vary.  I would suggest you compile a debug version
>> (--enable-debug) and then run some tests with --mca coll_base_verbose 100,
>> which will give you some insight into what is actually happening under
>> the covers.
>>
>> Rolf
>>
>>> -----Original Message-----
>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Yang
>>> Zhang
>>> Sent: Thursday, September 24, 2015 11:41 PM
>>> To: us...@open-mpi.org
>>> Subject: [OMPI users] How does MPI_Allreduce work?
>>>
>>> Hello OpenMPI users,
>>>
>>> Is there any documentation on the MPI_Allreduce() implementation? I’m using
>>> it to do a summation on GPU data. I wonder if Open MPI will first do the
>>> summation across processes in the same node, and then sum the intermediate
>>> results across nodes. That would be preferable since it reduces cross-node
>>> communication and should be faster?
>>>
>>> I’m using Open MPI 1.10.0 and CUDA 7.0. I need to sum 40 million floats
>>> on 6 nodes, each node running 4 processes. The nodes are connected via
>>> InfiniBand.
>>>
>>> Thanks very much!
>>>
>>> Best,
>>> Yang
>>>
>>> ---------------------------------------------------------------------
>>> ---
>>>
>>> Sent by Apple Mail
>>>
>>> Yang ZHANG
>>>
>>> PhD candidate
>>>
>>> Networking and Wide-Area Systems Group
>>> Computer Science Department
>>> New York University
>>>
>>> 715 Broadway Room 705
>>> New York, NY 10003
>>>
>>
>
