I've just compared IB p2p latency between Open MPI versions 1.6.5 and 1.8.8, and I'm surprised to find that 1.8 is rather worse, as below. Assuming that's not expected, are there any suggestions for debugging it?
This is with FDR Mellanox, between two Sandybridge nodes on the same blade chassis switch. The results are similar for IMB pingpong and osu_latency, and reproducible. I'm running both cases the same way as far as I can tell (e.g. core binding with 1.6 and not turning it off with 1.8), just rebuilding the test against each OMPI version.

The initial osu_latency figures for 1.6 are:

    # OSU MPI Latency Test v5.0
    # Size          Latency (us)
    0                       1.16
    1                       1.24
    2                       1.23
    4                       1.23
    8                       1.26
    16                      1.27
    32                      1.30
    64                      1.36

and for 1.8:

    # OSU MPI Latency Test v5.0
    # Size          Latency (us)
    0                       1.48
    1                       1.46
    2                       1.42
    4                       1.43
    8                       1.46
    16                      1.47
    32                      1.48
    64                      1.54
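To put a number on the gap, here is a minimal Python sketch that computes the per-size difference between the two sets of figures quoted above. The latency values are copied from this post; the names `SIZES`, `LAT_16`, `LAT_18`, `compare` and `mean_delta` are only illustrative and not part of any benchmark tool.

```python
# Latencies in microseconds, copied by hand from the osu_latency output
# quoted in this post (message sizes 0 to 64 bytes).
SIZES = [0, 1, 2, 4, 8, 16, 32, 64]
LAT_16 = [1.16, 1.24, 1.23, 1.23, 1.26, 1.27, 1.30, 1.36]  # Open MPI 1.6.5
LAT_18 = [1.48, 1.46, 1.42, 1.43, 1.46, 1.47, 1.48, 1.54]  # Open MPI 1.8.8


def compare(sizes, old, new):
    """Return (size, old, new, absolute delta, percent delta) per message size."""
    rows = []
    for size, a, b in zip(sizes, old, new):
        delta = b - a
        rows.append((size, a, b, delta, 100.0 * delta / a))
    return rows


def mean_delta(rows):
    """Average absolute latency difference (us) over all message sizes."""
    return sum(r[3] for r in rows) / len(rows)


if __name__ == "__main__":
    rows = compare(SIZES, LAT_16, LAT_18)
    print(f"{'Size':>6} {'1.6 (us)':>9} {'1.8 (us)':>9} {'delta':>7} {'%':>7}")
    for size, a, b, delta, pct in rows:
        print(f"{size:>6} {a:>9.2f} {b:>9.2f} {delta:>7.2f} {pct:>7.1f}")
    print(f"mean delta: {mean_delta(rows):.3f} us")
```

On these figures the difference is roughly constant across message sizes rather than growing with size, so it looks like a fixed per-message overhead.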