Hi Fei:

The CUDA-aware reduction support in Open MPI is rather simple.  The GPU 
buffers are copied into temporary host buffers and the reduction is then done 
with the host buffers.  When the host reduction completes, the data is 
copied back into the GPU buffers.  So there is no use of CUDA IPC or GPUDirect 
RDMA in the reduction.
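
For illustration, here is a minimal sketch of that staging pattern.  It is not 
the actual Open MPI code path; the element type, count, and the MPI_SUM 
operation are placeholders I picked.

    /* Minimal sketch of the host-staged reduction described above.
     * NOT the actual Open MPI internals; buffer names, the element type,
     * and the MPI_SUM operation are placeholder assumptions. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdlib.h>

    static void staged_gpu_reduce(const double *d_sendbuf, double *d_recvbuf,
                                  int count, int root, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);

        /* Temporary host staging buffers */
        double *h_send = malloc(count * sizeof(double));
        double *h_recv = malloc(count * sizeof(double));

        /* 1. Copy the GPU send buffer into a temporary host buffer */
        cudaMemcpy(h_send, d_sendbuf, count * sizeof(double),
                   cudaMemcpyDeviceToHost);

        /* 2. Do the reduction entirely with host buffers */
        MPI_Reduce(h_send, h_recv, count, MPI_DOUBLE, MPI_SUM, root, comm);

        /* 3. Copy the reduced result back into the GPU buffer on the root */
        if (rank == root)
            cudaMemcpy(d_recvbuf, h_recv, count * sizeof(double),
                       cudaMemcpyHostToDevice);

        free(h_send);
        free(h_recv);
    }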

Rolf

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Fei Mao
Sent: Wednesday, June 17, 2015 1:08 PM
To: us...@open-mpi.org
Subject: [OMPI users] CUDA-aware MPI_Reduce problem in Openmpi 1.8.5

Hi there,

I am doing benchmarks on a GPU cluster with two CPU sockets and 4 K80 GPUs per 
node. Two K80s are connected to CPU socket 0, and the other two to socket 1. An IB 
ConnectX-3 (FDR) is also under socket 1. We are using Linux's OFED, so I know 
there is no way to do GPUDirect RDMA for inter-node communication. I can do 
intra-node IPC for MPI_Send and MPI_Recv between two K80s (4 GPUs in total) that 
are connected under the same socket (PCI-e switch). So I thought I could do 
intra-node MPI_Reduce with IPC support in Open MPI 1.8.5.
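
For reference, what I am timing is essentially a direct MPI_Reduce on device 
pointers, along the lines of the sketch below (the buffer size and the MPI_SUM 
operation are just placeholders, and error checking is omitted):

    /* Sketch of a CUDA-aware MPI_Reduce call on device buffers.
     * Sizes and the MPI_SUM op are placeholders; error checking omitted. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int count = 1 << 20;
        double *d_send, *d_recv;
        cudaMalloc((void **)&d_send, count * sizeof(double));
        cudaMalloc((void **)&d_recv, count * sizeof(double));
        cudaMemset(d_send, 0, count * sizeof(double));

        /* With a CUDA-aware build, device pointers are passed directly. */
        MPI_Reduce(d_send, d_recv, count, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        cudaFree(d_send);
        cudaFree(d_recv);
        MPI_Finalize();
        return 0;
    }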

The benchmark I was using is osu-micro-benchmarks-4.4.1, and I got the same 
results whether I used two GPUs under the same socket or under different sockets. 
The results were the same even when I used two GPUs in different nodes.

Does MPI_Reduce use IPC for intra-node communication? Do I have to install the 
Mellanox OFED stack to support GPUDirect RDMA reductions on GPUs even when they 
are under the same PCI-e switch?

Thanks,

Fei Mao
High Performance Computing Technical Consultant
SHARCNET | http://www.sharcnet.ca
Compute/Calcul Canada | http://www.computecanada.ca

