Hi,

I am currently running some tests on a system that has both Gigabit
Ethernet and InfiniBand interconnects. The latency and bandwidth
benchmarks behave as expected over InfiniBand, but the Ethernet
interconnect is achieving far higher performance than expected:
Ethernet and InfiniBand are delivering essentially equivalent results.

For some reason, it looks like Open MPI (v1.8.1) is either using the
InfiniBand interconnect rather than the Gigabit one, or the TCP traffic
is being carried over the InfiniBand fabric (for example via IPoIB).
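
For what it is worth, here is roughly how I plan to confirm on a node
that em1 really is the Gigabit NIC and to see whether an IPoIB
interface is also present (the interface names here are just my
assumption about these nodes):

ip link show                     # look for an ib0-style IPoIB device alongside em1
ethtool em1 | grep -i speed      # a Gigabit NIC should report Speed: 1000Mb/s
cat /sys/class/net/em1/type      # 1 = Ethernet, 32 = InfiniBand (IPoIB)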

Here are the latency and bandwidth benchmark results:
#---------------------------------------------------
# Benchmarking PingPong
# processes = 2
# map-by node
#---------------------------------------------------

Hello, world.  I am 1 on node124
Hello, world.  I am 0 on node123
Size (bytes)    Latency (usec)    Bandwidth (Mbps)
1    1.65    4.62
2    1.67    9.16
4    1.66    18.43
8    1.66    36.74
16    1.85    66.00
32    1.83    133.28
64    1.83    266.36
128    1.88    519.10
256    1.99    982.29
512    2.23    1752.37
1024    2.58    3026.98
2048    3.32    4710.76

I read some of the FAQs and noted that Open MPI prefers the fastest
available interconnect. In an effort to force it to use the Gigabit
interconnect, I ran the following:

1. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca
btl_tcp_if_include em1 ./latency.ompi
2. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp,self,sm
--mca btl_tcp_if_include em1 ./latency.ompi
3. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib --mca
btl_tcp_if_include em1 ./latency.ompi
4. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib
./latency.ompi

None of them resulted in a significantly different benchmark output.
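
One further variant I intend to try is to exclude the IPoIB interface
explicitly instead of including em1 (assuming the IPoIB interface on
these nodes is called ib0, which is a guess on my part):

mpirun -np 2 -machinefile machines -map-by node --mca btl tcp,self,sm
--mca btl_tcp_if_exclude lo,ib0 ./latency.ompi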

I use Open MPI by loading a module in a clustered environment and do
not have admin access. That build is configured for both TCP and OpenIB
(confirmed with ompi_info). After trying all of the above without
success, I installed Open MPI v1.8.2 in my home directory and disabled
openib with the following configure options:

--disable-openib-control-hdr-padding --disable-openib-dynamic-sl
--disable-openib-connectx-xrc --disable-openib-udcm
--disable-openib-rdmacm  --disable-btl-openib-malloc-alignment
--disable-io-romio --without-openib --without-verbs
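
In case it matters, the build itself was done roughly like this (the
install prefix is just illustrative):

./configure --prefix=$HOME/openmpi-1.8.2-noib [plus the options above]
make
make install

and I point PATH and LD_LIBRARY_PATH at that prefix before running
mpirun.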

Now openib is not enabled (confirmed with ompi_info), and there is no
"openib.so" file in the $prefix/lib/openmpi directory either. Still,
the mpirun commands above get the same latency and bandwidth as
InfiniBand.
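
The checks were along these lines, where $prefix is my home-directory
install:

$prefix/bin/ompi_info | grep -i openib     # no output, i.e. no openib component
ls $prefix/lib/openmpi | grep -i openib    # no output here either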

I also tried mpirun in verbose mode with the following command, and here is the output.

Command:
mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca
btl_base_verbose 30 --mca btl_tcp_if_include em1 ./latency.ompi

Output:
[node123.prv.sciama.cluster:88310] mca: base: components_register:
registering btl components
[node123.prv.sciama.cluster:88310] mca: base: components_register: found
loaded component tcp
[node123.prv.sciama.cluster:88310] mca: base: components_register:
component tcp register function successful
[node123.prv.sciama.cluster:88310] mca: base: components_open: opening btl
components
[node123.prv.sciama.cluster:88310] mca: base: components_open: found loaded
component tcp
[node123.prv.sciama.cluster:88310] mca: base: components_open: component
tcp open function successful
[node124.prv.sciama.cluster:90465] mca: base: components_register:
registering btl components
[node124.prv.sciama.cluster:90465] mca: base: components_register: found
loaded component tcp
[node124.prv.sciama.cluster:90465] mca: base: components_register:
component tcp register function successful
[node124.prv.sciama.cluster:90465] mca: base: components_open: opening btl
components
[node124.prv.sciama.cluster:90465] mca: base: components_open: found loaded
component tcp
[node124.prv.sciama.cluster:90465] mca: base: components_open: component
tcp open function successful
Hello, world.  I am 1 on node124
Hello, world.  I am 0 on node123
Size (bytes)    Latency (usec)    Bandwidth (Mbps)
1    4.18    1.83
2    3.66    4.17
4    4.08    7.48
8    3.12    19.57
16    3.83    31.84
32    3.40    71.84
64    4.10    118.97
128    3.89    251.19
256    4.22    462.77
512    2.95    1325.71
1024    2.63    2969.49
2048    3.38    4628.29
[node123.prv.sciama.cluster:88310] mca: base: close: component tcp closed
[node123.prv.sciama.cluster:88310] mca: base: close: unloading component tcp
[node124.prv.sciama.cluster:90465] mca: base: close: component tcp closed
[node124.prv.sciama.cluster:90465] mca: base: close: unloading component tcp

Even with this TCP-only build, the reported bandwidth reaches roughly
4.6 Gbps for 2 KB messages, which is well beyond what Gigabit Ethernet
can deliver. Moreover, the same benchmark applications built against
MPICH work fine over Ethernet and achieve the expected latency and
bandwidth.
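
For reference, the MPICH runs are launched roughly like this (assuming
the Hydra mpiexec; -iface pins it to the Gigabit interface, and the
binary name is simply my MPICH build of the same benchmark):

mpiexec -n 2 -f machines -iface em1 ./latency.mpich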

How can this be fixed?

Thanks for the help,

--Ansar
