On Friday 09 July 2010, Andreas Schäfer wrote:
> Hi,
>
> I'm evaluating Open MPI 1.4.2 on one of our BladeCenters and I'm
> getting via InfiniBand about 1550 MB/s and via shared memory about
> 1770 for the PingPong benchmark in Intel's MPI benchmark. (That
> benchmark is just an example, I'm seeing similar numbers for my own
> codes.)
Two factors make a big difference: the size of the operations and the type of node (CPU model). On an E5520 (Nehalem) node I get ~5 GB/s ping-pong for >64K message sizes. On QDR IB on similar nodes I get ~3 GB/s ping-pong for >256K. Numbers are for 1.4.1, YMMV. I couldn't find an AMD node similar to yours, sorry. (A rough ping-pong sketch follows the quoted text below.)

/Peter

> Each node has two AMD hex-cores and two 40 Gbps InfiniBand ports, so I
> wonder if I shouldn't be getting a significantly higher throughput on
> InfiniBand. Considering the CPUs' memory bandwidth, I believe that
> shared memory throughput should be much higher as well.
>
> Are those numbers what is to be expected? If not: any ideas how to
> debug this or tune Open MPI?
>
> Thanks in advance
> -Andreas
>
> ps: if it's any help, this is what iblinkinfo is telling me
> (tests were run on faui36[bc])
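For reference, below is a minimal ping-pong loop along the lines of what IMB's PingPong measures. It is only a sketch (not the actual IMB source); the message sizes, repetition count and the MB/s accounting are my own choices, but it is enough to reproduce the size dependence by hand.

/* Ping-pong sketch: rank 0 sends `size` bytes to rank 1 and waits for
 * them to come back; bandwidth is computed from half the round-trip time.
 * Run with exactly two ranks. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    if (nprocs != 2) {
        if (rank == 0) fprintf(stderr, "run with -np 2\n");
        MPI_Finalize();
        return 1;
    }

    const int reps = 1000;
    for (size_t size = 1; size <= (1 << 22); size *= 2) {
        char *buf = malloc(size);
        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, (int)size, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, (int)size, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(buf, (int)size, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, (int)size, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t = (MPI_Wtime() - t0) / (2.0 * reps); /* one-way time per msg */
        if (rank == 0)
            printf("%8zu bytes  %9.2f MB/s\n", size, size / t / 1e6);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}

Running both ranks on one node exercises the shared-memory path, one rank per node goes over IB; with Open MPI you can pin the transport with something like "mpirun --mca btl self,sm ..." versus "--mca btl self,openib ..." (MCA parameter names as I remember them for the 1.4 series).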