On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote:

> Meanwhile, I contacted some people at SciNet, which is also part of Compute
> Canada.
>
> They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl
> self,ofud to use the ofud BTL instead of openib for OpenFabrics transport.
>
> This worked quite good -- I got a low latency of 35 microseconds. Yay !
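(For the archives: the run described above amounts to something like the
following; the application name, process count, and hostfile here are just
placeholders:

    mpirun --mca btl self,ofud -np 16 -hostfile my_hosts ./my_mpi_app

)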
That's still pretty terrible.

Per your comments below, yes, ofud was never finished. I believe it has no
retransmission code, so if anything is dropped by the network (and in a
congested/busy network, there will be drops), the job will likely hang.

The ofud and openib BTLs should have similar latencies. Indeed, openib
should actually have slightly lower HRT ping-pong latencies because of
protocol and transport differences between the two. The openib BTL should
give about the same latency as ibv_rc_pingpong, which you cited at about 11
microseconds (I assume there must be multiple hops in that IB network for
it to be that high). That jibes with your "only 1 process sends" Ray
network test (http://pastebin.com/dWMXsHpa).

It's not uncommon for latency to go up when multiple processes are all
banging on the HCA, but it shouldn't go up noticeably if there are only 2
processes on each node doing simple ping-pong tests, for example. What
happens if you run 2 ibv_rc_pingpong's on each node? Or N
ibv_rc_pingpongs? (See the sketch after my signature for one way to try
this.)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
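A rough sketch of the 2-pingpongs-per-node test, assuming two nodes named
node1 and node2 and the default IB device; each server/client pair needs
its own TCP port for connection setup (the -p flag; the default is 18515):

    # on node1: start two listeners on distinct ports
    ibv_rc_pingpong -p 18515 &
    ibv_rc_pingpong -p 18516 &

    # on node2: connect one client to each listener
    ibv_rc_pingpong -p 18515 node1 &
    ibv_rc_pingpong -p 18516 node1 &

Each pair reports its own usec/iter figure; to scale up to N pairs, keep
adding instances on successive ports and watch whether the per-pair
latency degrades.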