On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote:

>> What happens if you run 2 ibv_rc_pingpong's on each node?  Or N 
>> ibv_rc_pingpongs?
> 
> With 11 ibv_rc_pingpong's
> 
> http://pastebin.com/85sPcA47
> 
> Code to do that => https://gist.github.com/1233173
> 
> Latencies are around 20 microseconds.

This seems to imply that the network is to blame for the higher latency...?

That is, if you run the same pattern with MPI processes and also see about 
20us latency, that would suggest the network itself is not performing well 
under that I/O pattern, rather than the MPI layer adding the delay.
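
For anyone who wants to reproduce the N-concurrent-pingpong pattern, here is a
rough sketch in Python. It is not the script from the gist linked above, just
one way to do it. The helper names (pingpong_commands, run_concurrently) and
the host name "server-node" are made up for illustration. It assumes a
matching ibv_rc_pingpong server is already listening on each port on the
remote node (started with the same -p value and no host argument).

```python
import subprocess


def pingpong_commands(server, n, base_port=18515, iters=1000, size=4096):
    """Build one ibv_rc_pingpong client command per concurrent stream.

    Each stream gets its own TCP port (-p) for the out-of-band connection
    setup, so the instances do not collide. 18515 is the tool's default port.
    """
    return [
        ["ibv_rc_pingpong",
         "-p", str(base_port + i),   # distinct port per instance
         "-n", str(iters),           # number of exchanges
         "-s", str(size),            # message size in bytes
         server]
        for i in range(n)
    ]


def run_concurrently(commands):
    """Start every command first, then collect each one's stdout."""
    procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
             for cmd in commands]
    return [proc.communicate()[0] for proc in procs]
```

Calling run_concurrently(pingpong_commands("server-node", 11)) would launch 11
clients at once, and the usec/iter figure each one reports can then be
compared against a single-instance run.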

> My job seems to do well so far with ofud !
> 
> [sboisver12@colosse2 ray]$ qstat
> job-ID  prior   name       user         state submit/start at     queue         slots ja-task-ID
> --------------------------------------------------------------------------------------------------
> 3047460 0.55384 fish-Assem sboisver12   r     09/21/2011 15:02:25 med@r104-n58  256

I would still be suspicious -- ofud is not well tested, and it can definitely 
hang if there are network drops.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

