Hello Jeff. Thanks for pointing out NetPIPE to me. I've played around with it a little in the hope of seeing clear evidence of the effect of message striping in Open MPI. Unfortunately, what I saw is that the result of running NPmpi over several interconnects is identical to running it over the single fastest one :-( That was not the expected behavior, and I'm hoping that I'm doing something wrong. I'm using NetPIPE_3.6.2 over OMPI 1.1.4. NetPIPE was compiled by making sure Open MPI's mpicc can be found and then simply running 'make mpi' in the NetPIPE_3.6.2 directory.
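
For completeness, this is roughly the build sequence I followed (the tarball name and the use of $MPIHOME for the Open MPI install reflect my setup, so treat the exact paths as approximate):

# tar xzf NetPIPE_3.6.2.tar.gz
# cd NetPIPE_3.6.2
# export PATH=$MPIHOME/bin:$PATH    (so that Open MPI's mpicc is the one found first)
# which mpicc
# make mpi                          (produces the NPmpi executable used below)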
I experimented with 3 interconnects: openib, gm, and gig-e. Specifically, I found that the times (and, correspondingly, the bandwidth) reported for openib+gm are pretty much identical to the times reported for just openib. Here are the commands I used to initiate the benchmark:

# mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl openib,gm,self ~/NPmpi > ~/testdir/ompi/netpipe/ompi_netpipe_openib+gm.log 2>&1
# mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl openib,self ~/NPmpi > ~/testdir/ompi/netpipe/ompi_netpipe_openib.log 2>&1
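
In case it helps to narrow this down, I assume something along these lines can be used to confirm which BTLs were built and which ones actually get selected at run time; I'm not certain these are the right knobs for 1.1.4, so please treat them as a rough sketch rather than verified commands. The first two list the BTL components that were built and the openib BTL's MCA parameters; the last adds verbose diagnostic output about BTL selection during the run:

# ompi_info | grep btl
# ompi_info --param btl openib
# mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl openib,gm,self --mca btl_base_verbose 30 ~/NPmpi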
Similarly, for tcp+gm the reported times were identical to just running the benchmark over gm alone. The commands were:

# mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl tcp,gm,self --mca btl_tcp_if_exclude lo,ib0,ib1 ~/NPmpi
# mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl gm,self ~/NPmpi

Orthogonally, I've also observed that any combination of interconnects that includes openib (except using it exclusively) hangs as soon as the benchmark reaches the trials with 1.5MB message sizes: the CPU load remains at 100% on the headnode, but no further output is sent to the log file or the screen (see the tails below). This behavior is fairly consistent and may be of interest to the Open MPI development community. If anybody has tried using openib in combination with other interconnects, please let me know what issues you've encountered and what tips and tricks you could share in this regard.

Many thanks. Keep up the good work!

Sincerely,
Alex.

Tails (the log file name reflects the combination of interconnects, in that command-line order):

# tail ompi_netpipe_gm+openib.log
101: 786432 bytes 38 times --> 3582.46 Mbps in 1674.83 usec
102: 786435 bytes 39 times --> 3474.50 Mbps in 1726.87 usec
103: 1048573 bytes 19 times --> 3592.47 Mbps in 2226.87 usec
104: 1048576 bytes 22 times --> 3515.15 Mbps in 2275.86 usec
105: 1048579 bytes 21 times --> 3480.22 Mbps in 2298.71 usec
106: 1572861 bytes 21 times --> 4174.76 Mbps in 2874.41 usec
107: 1572864 bytes 23 times -->
mpirun: killing job...

# tail ompi_netpipe_openib+gm.log
100: 786429 bytes 45 times --> 3477.98 Mbps in 1725.13 usec
101: 786432 bytes 38 times --> 3578.94 Mbps in 1676.47 usec
102: 786435 bytes 39 times --> 3480.66 Mbps in 1723.82 usec
103: 1048573 bytes 19 times --> 3594.26 Mbps in 2225.76 usec
104: 1048576 bytes 22 times --> 3517.46 Mbps in 2274.37 usec
105: 1048579 bytes 21 times --> 3482.13 Mbps in 2297.45 usec
106: 1572861 bytes 21 times -->
mpirun: killing job...

# tail ompi_netpipe_openib+tcp+gm.log
100: 786429 bytes 45 times --> 3481.45 Mbps in 1723.41 usec
101: 786432 bytes 38 times --> 3575.83 Mbps in 1677.93 usec
102: 786435 bytes 39 times --> 3479.05 Mbps in 1724.61 usec
103: 1048573 bytes 19 times --> 3589.68 Mbps in 2228.61 usec
104: 1048576 bytes 22 times --> 3517.96 Mbps in 2274.05 usec
105: 1048579 bytes 21 times --> 3484.12 Mbps in 2296.14 usec
106: 1572861 bytes 21 times -->
mpirun: killing job...

# tail -5 ompi_netpipe_openib.log
119: 6291456 bytes 5 times --> 4036.63 Mbps in 11891.10 usec
120: 6291459 bytes 5 times --> 4005.81 Mbps in 11982.61 usec
121: 8388605 bytes 3 times --> 4033.78 Mbps in 15866.00 usec
122: 8388608 bytes 3 times --> 4025.50 Mbps in 15898.66 usec
123: 8388611 bytes 3 times --> 4017.58 Mbps in 15929.98 usec
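
P.S. My working assumption is that the way large messages get scheduled across BTLs is influenced by per-BTL bandwidth/latency/exclusivity MCA parameters, but I haven't verified the exact parameter names for 1.1.4, so the following is only a guess at how to check whether such parameters exist and what their defaults are:

# ompi_info --param btl all | grep -i bandwidth
# ompi_info --param btl all | grep -i latency
# ompi_info --param btl all | grep -i exclusivity

If those turn out not to be the right knobs, a pointer to whatever does control the striping would be much appreciated.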