[OMPI users] How does MPI_Allreduce work?

Yang Zhang Thu, 24 Sep 2015 23:41:22 -0400 (EDT)

Hello OpenMPI users,

Is there any documentation on how MPI_Allreduce() is implemented? I’m
using it to sum GPU data. I wonder whether OpenMPI first sums across the
processes on the same node and then sums the intermediate results across
nodes. That would seem preferable, since it reduces cross-node
communication and should therefore be faster.


I’m using OpenMPI 1.10.0 and CUDA 7.0. I need to sum 40 million floats
across 6 nodes, each running 4 processes. The nodes are connected via
InfiniBand.
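
For concreteness, the two-level summation asked about above can be
sketched as follows. This is a plain Python simulation, not OpenMPI code
and not a statement of what OpenMPI actually does: ranks are just list
entries, the function name hierarchical_allreduce is hypothetical, the
per-rank contributions are made-up integers, and ranks are assumed to be
placed on nodes in contiguous blocks of 4.

```python
# Simulation of a two-level (hierarchical) allreduce sum.
# Not OpenMPI code: ranks are list entries and "communication" is plain Python.

NODES = 6            # node count from the setup described above
RANKS_PER_NODE = 4   # processes per node from the setup described above


def hierarchical_allreduce(values, ranks_per_node):
    """Return the per-rank results of a two-level sum over `values`.

    values[i] is the contribution of rank i. Ranks are assumed (for this
    sketch only) to sit on nodes in contiguous blocks of `ranks_per_node`.
    """
    if len(values) % ranks_per_node != 0:
        raise ValueError("rank count must be a multiple of ranks_per_node")

    # Step 1: each node sums the contributions of its own ranks (intra-node).
    node_sums = [
        sum(values[start:start + ranks_per_node])
        for start in range(0, len(values), ranks_per_node)
    ]

    # Step 2: one leader per node takes part in the cross-node sum.
    total = sum(node_sums)

    # Step 3: each leader hands the total back to the ranks on its node.
    return [total] * len(values)


# Hypothetical contributions: rank i contributes the integer i + 1.
contributions = [rank + 1 for rank in range(NODES * RANKS_PER_NODE)]
results = hierarchical_allreduce(contributions, RANKS_PER_NODE)

# In this scheme only one rank per node takes part in the cross-node step.
cross_node_participants = NODES
flat_participants = NODES * RANKS_PER_NODE
```

In a real MPI program the per-node grouping could be built with
MPI_Comm_split_type and MPI_COMM_TYPE_SHARED; whether OpenMPI does
something equivalent inside MPI_Allreduce() is the open question here.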

Thanks very much!

Best,
Yang

------------------------------------------------------------------------

Yang ZHANG

PhD candidate

Networking and Wide-Area Systems Group
Computer Science Department
New York University

715 Broadway Room 705
New York, NY 10003
