Hi, I am currently doing some testing on a system with Gigabit Ethernet and InfiniBand interconnects. The latency and bandwidth benchmarks perform as expected on the InfiniBand interconnect, but the Ethernet interconnect is achieving far higher performance than expected: Ethernet and InfiniBand are delivering essentially equivalent numbers.
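For reference, the core of my latency benchmark is essentially a standard two-rank ping-pong loop; the sketch below is a simplification (the real code sweeps message sizes and does warm-up iterations, and the repetition count here is just an example):

/* Simplified sketch of the ping-pong loop in latency.ompi */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, reps = 1000, size = 1;   /* example values only */
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(size);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)   /* report one-way latency per message */
        printf("size %d: latency %.2f usec\n", size,
               (t1 - t0) * 1e6 / (2.0 * reps));

    free(buf);
    MPI_Finalize();
    return 0;
}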
For some reason, it looks like OpenMPI (v1.8.1) is using the InfiniBand interconnect rather than the Gigabit Ethernet, or the TCP communication is being emulated over the InfiniBand interconnect. Here are the latency and bandwidth benchmark results:

#---------------------------------------------------
# Benchmarking PingPong
# processes = 2
# map-by node
#---------------------------------------------------
Hello, world. I am 1 on node124
Hello, world. I am 0 on node123
Size    Latency (usec)    Bandwidth (Mbps)
1       1.65              4.62
2       1.67              9.16
4       1.66              18.43
8       1.66              36.74
16      1.85              66.00
32      1.83              133.28
64      1.83              266.36
128     1.88              519.10
256     1.99              982.29
512     2.23              1752.37
1024    2.58              3026.98
2048    3.32              4710.76

I read some of the FAQs and noted that OpenMPI prefers the fastest available interconnect. In an effort to force it to use the Gigabit interconnect, I ran it as follows:

1. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca btl_tcp_if_include em1 ./latency.ompi
2. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp,self,sm --mca btl_tcp_if_include em1 ./latency.ompi
3. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib --mca btl_tcp_if_include em1 ./latency.ompi
4. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib ./latency.ompi

None of them produced a significantly different benchmark result.

I am using OpenMPI by loading a module in a clustered environment and don't have admin access. It is configured for both TCP and OpenIB (confirmed via ompi_info). After trying all of the above methods without success, I installed OpenMPI v1.8.2 in my home directory and disabled openib with the following configure options:

--disable-openib-control-hdr-padding --disable-openib-dynamic-sl
--disable-openib-connectx-xrc --disable-openib-udcm
--disable-openib-rdmacm --disable-btl-openib-malloc-alignment
--disable-io-romio --without-openib --without-verbs

Now openib is not enabled (confirmed via ompi_info) and there is no "openib.so" file in the $prefix/lib/openmpi directory either. Still, the mpirun commands above achieve the same latency and bandwidth as InfiniBand.

I also ran mpirun in verbose mode with the following command; here is the output.

Command:
mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca btl_base_verbose 30 --mca btl_tcp_if_include em1 ./latency.ompi

Output:
[node123.prv.sciama.cluster:88310] mca: base: components_register: registering btl components
[node123.prv.sciama.cluster:88310] mca: base: components_register: found loaded component tcp
[node123.prv.sciama.cluster:88310] mca: base: components_register: component tcp register function successful
[node123.prv.sciama.cluster:88310] mca: base: components_open: opening btl components
[node123.prv.sciama.cluster:88310] mca: base: components_open: found loaded component tcp
[node123.prv.sciama.cluster:88310] mca: base: components_open: component tcp open function successful
[node124.prv.sciama.cluster:90465] mca: base: components_register: registering btl components
[node124.prv.sciama.cluster:90465] mca: base: components_register: found loaded component tcp
[node124.prv.sciama.cluster:90465] mca: base: components_register: component tcp register function successful
[node124.prv.sciama.cluster:90465] mca: base: components_open: opening btl components
[node124.prv.sciama.cluster:90465] mca: base: components_open: found loaded component tcp
[node124.prv.sciama.cluster:90465] mca: base: components_open: component tcp open function successful
Hello, world. I am 1 on node124
Hello, world. I am 0 on node123
Size    Latency (usec)    Bandwidth (Mbps)
1       4.18              1.83
2       3.66              4.17
4       4.08              7.48
8       3.12              19.57
16      3.83              31.84
32      3.40              71.84
64      4.10              118.97
128     3.89              251.19
256     4.22              462.77
512     2.95              1325.71
1024    2.63              2969.49
2048    3.38              4628.29
[node123.prv.sciama.cluster:88310] mca: base: close: component tcp closed
[node123.prv.sciama.cluster:88310] mca: base: close: unloading component tcp
[node124.prv.sciama.cluster:90465] mca: base: close: component tcp closed
[node124.prv.sciama.cluster:90465] mca: base: close: unloading component tcp

Moreover, the same benchmark applications built with MPICH work fine over Ethernet and achieve the expected latency and bandwidth.

How can this be fixed?

Thanks for the help,
--Ansar
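P.S. To independently confirm which link the traffic actually crosses, I am thinking of reading the kernel's per-interface byte counters before and after a benchmark run, along these lines. This is only a rough sketch: "em1" is my GigE port, the IPoIB interface name "ib0" is an assumption for this cluster, and native verbs traffic would not appear in the IPoIB counters anyway, so the real check is whether em1's counters grow by roughly the transferred volume.

/* Read /sys/class/net/<iface>/statistics/<counter> (standard Linux sysfs paths) */
#include <stdio.h>

static long long read_counter(const char *iface, const char *counter)
{
    char path[256];
    long long value = -1;
    FILE *f;

    snprintf(path, sizeof(path), "/sys/class/net/%s/statistics/%s",
             iface, counter);
    f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%lld", &value) != 1)
            value = -1;
        fclose(f);
    }
    return value;
}

int main(void)
{
    /* Run once before and once after mpirun, then compare the deltas. */
    printf("em1 tx_bytes: %lld\n", read_counter("em1", "tx_bytes"));
    printf("ib0 tx_bytes: %lld\n", read_counter("ib0", "tx_bytes")); /* "ib0" is an assumption */
    return 0;
}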