This is strange. I have a similar environment, with one Ethernet interface and
one IPoIB interface. If I manually select the interface I want to use with TCP,
I get the expected results.


Here it is over the IPoIB interface (ib1):

mpirun -np 2 --mca btl tcp,self -host dancer00,dancer01 --mca btl_tcp_if_include ib1 ./NPmpi
1: dancer01
0: dancer00
Now starting the main loop
  0:       1 bytes   3093 times -->      0.24 Mbps in      31.39 usec
  1:       2 bytes   3185 times -->      0.49 Mbps in      31.30 usec
  2:       3 bytes   3195 times -->      0.73 Mbps in      31.41 usec
  3:       4 bytes   2122 times -->      0.97 Mbps in      31.39 usec


And here it is over the slightly slower eth0:

mpirun -np 2 --mca btl tcp,self -host dancer00,dancer01 --mca btl_tcp_if_include eth0 ./NPmpi
0: dancer00
1: dancer01
Now starting the main loop
  0:       1 bytes   1335 times -->      0.13 Mbps in      60.55 usec
  1:       2 bytes   1651 times -->      0.28 Mbps in      53.62 usec
  2:       3 bytes   1864 times -->      0.45 Mbps in      51.29 usec
  3:       4 bytes   1299 times -->      0.61 Mbps in      50.36 usec
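
If you want to double-check which interface the TCP BTL actually uses, adding
the BTL verbosity flag (the same one you used below) should show which BTL
components get opened on each node; a rough sketch, and the exact output varies
between releases:

mpirun -np 2 --mca btl tcp,self --mca btl_tcp_if_include eth0 --mca btl_base_verbose 30 -host dancer00,dancer01 ./NPmpi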


George.

On Wed, Sep 10, 2014 at 3:40 AM, Muhammad Ansar Javed <
muhammad.an...@seecs.edu.pk> wrote:

> Thanks George,
> I am selecting the Ethernet device (em1) in the mpirun script.
>
> Here is ifconfig output:
> em1       Link encap:Ethernet  HWaddr E0:DB:55:FD:38:46
>           inet addr:10.30.10.121  Bcast:10.30.255.255  Mask:255.255.0.0
>           inet6 addr: fe80::e2db:55ff:fefd:3846/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:1537270190 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:136123598 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:309333740659 (288.0 GiB)  TX bytes:143480101212 (133.6
> GiB)
>           Memory:91820000-91840000
>
> Ifconfig uses the ioctl access method to get the full address information,
> which limits hardware addresses to 8 bytes.
> Because Infiniband address has 20 bytes, only the first 8 bytes are
> displayed correctly.
> Ifconfig is obsolete! For replacement check ip.
> ib0       Link encap:InfiniBand  HWaddr
> 80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>           inet addr:10.32.10.121  Bcast:10.32.255.255  Mask:255.255.0.0
>           inet6 addr: fe80::211:7500:70:6ab4/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
>           RX packets:33621 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:365 errors:0 dropped:5 overruns:0 carrier:0
>           collisions:0 txqueuelen:256
>           RX bytes:1882728 (1.7 MiB)  TX bytes:21920 (21.4 KiB)
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:66889 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:66889 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:19005445 (18.1 MiB)  TX bytes:19005445 (18.1 MiB)
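>
> As ifconfig itself suggests, the ip tool from iproute2 shows the full 20-byte
> InfiniBand hardware address as well as the per-interface IP addresses; the
> roughly equivalent checks would be something like:
>
> ip addr show em1
> ip addr show ib0
> ip link show ib0
>
> (em1 and ib0 are the interfaces listed above; the output layout differs from
> ifconfig.)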
>
>
>
>
>
>
>> Date: Wed, 10 Sep 2014 00:06:51 +0900
>> From: George Bosilca <bosi...@icl.utk.edu>
>> To: Open MPI Users <us...@open-mpi.org>
>> Subject: Re: [OMPI users] Forcing OpenMPI to use Ethernet interconnect
>>         instead of InfiniBand
>>
>>
>> Look at your ifconfig output and select the Ethernet device (instead of the
>> IPoIB one). Traditionally the names lack any fanciness, with most
>> distributions using eth0 as the default.
>>
>>   George.
>>
>>
>> On Tue, Sep 9, 2014 at 11:24 PM, Muhammad Ansar Javed <
>> muhammad.an...@seecs.edu.pk> wrote:
>>
>> > Hi,
>> >
>> > I am currently conducting some testing on a system with Gigabit Ethernet
>> > and InfiniBand interconnects. Both the latency and the bandwidth benchmarks
>> > perform as expected over InfiniBand, but the Ethernet interconnect is
>> > achieving far higher performance than expected: Ethernet and InfiniBand are
>> > achieving equivalent performance.
>> >
>> > For some reason, it looks like Open MPI (v1.8.1) is either using the
>> > InfiniBand interconnect rather than the Gigabit one, or the TCP
>> > communication is being carried over the InfiniBand interconnect (IPoIB).
>> >
>> > Here are Latency and Bandwidth benchmark results.
>> > #---------------------------------------------------
>> > # Benchmarking PingPong
>> > # processes = 2
>> > # map-by node
>> > #---------------------------------------------------
>> >
>> > Hello, world.  I am 1 on node124
>> > Hello, world.  I am 0 on node123
>> > Size Latency (usec) Bandwidth (Mbps)
>> > 1    1.65    4.62
>> > 2    1.67    9.16
>> > 4    1.66    18.43
>> > 8    1.66    36.74
>> > 16    1.85    66.00
>> > 32    1.83    133.28
>> > 64    1.83    266.36
>> > 128    1.88    519.10
>> > 256    1.99    982.29
>> > 512    2.23    1752.37
>> > 1024    2.58    3026.98
>> > 2048    3.32    4710.76
>> >
>> > I read some of the FAQs and noted that Open MPI prefers the fastest
>> > available interconnect. In an effort to force it to use the Gigabit
>> > interconnect, I ran it as follows:
>> >
>> > 1. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca btl_tcp_if_include em1 ./latency.ompi
>> > 2. mpirun -np 2 -machinefile machines -map-by node --mca btl tcp,self,sm --mca btl_tcp_if_include em1 ./latency.ompi
>> > 3. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib --mca btl_tcp_if_include em1 ./latency.ompi
>> > 4. mpirun -np 2 -machinefile machines -map-by node --mca btl ^openib ./latency.ompi
>> >
>> > None of them resulted in a significantly different benchmark output.
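>> >
>> > (One more variant I am considering, in case a non-BTL transport such as an
>> > InfiniBand MTL is being selected by the PML and the btl settings are simply
>> > being ignored; this is only a guess on my part:
>> >
>> > mpirun -np 2 -machinefile machines -map-by node --mca pml ob1 \
>> >     --mca btl tcp,self,sm --mca btl_tcp_if_include em1 ./latency.ompi
>> >
>> > Forcing the ob1 PML should keep all point-to-point traffic on the listed
>> > BTLs.)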
>> >
>> > I am using Open MPI by loading a module on a shared cluster and don't have
>> > admin access. That build is configured for both TCP and OpenIB (confirmed
>> > from ompi_info). After trying all of the above-mentioned methods without
>> > success, I installed Open MPI v1.8.2 in my home directory and disabled
>> > openib with the following configuration options:
>> >
>> > --disable-openib-control-hdr-padding --disable-openib-dynamic-sl
>> > --disable-openib-connectx-xrc --disable-openib-udcm
>> > --disable-openib-rdmacm  --disable-btl-openib-malloc-alignment
>> > --disable-io-romio --without-openib --without-verbs
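>> >
>> > Put together, the configure invocation was roughly as follows (the install
>> > prefix is just a placeholder for the directory I used):
>> >
>> > ./configure --prefix=$HOME/openmpi-1.8.2-tcp \
>> >     --disable-openib-control-hdr-padding --disable-openib-dynamic-sl \
>> >     --disable-openib-connectx-xrc --disable-openib-udcm \
>> >     --disable-openib-rdmacm --disable-btl-openib-malloc-alignment \
>> >     --disable-io-romio --without-openib --without-verbs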
>> >
>> > Now openib is not enabled (confirmed with the ompi_info script) and there is
>> > no "openib.so" file in the $prefix/lib/openmpi directory either. Still, the
>> > above-mentioned mpirun commands get the same latency and bandwidth as over
>> > InfiniBand.
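>> >
>> > (For reference, the component list can be checked with something like
>> >
>> > ompi_info | grep "btl:"
>> > ompi_info | grep openib
>> >
>> > and openib no longer appears for this build.)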
>> >
>> > I tried mpirun in verbose mode with the following command; here is the
>> > output.
>> >
>> > Command:
>> > mpirun -np 2 -machinefile machines -map-by node --mca btl tcp --mca
>> > btl_base_verbose 30 --mca btl_tcp_if_include em1 ./latency.ompi
>> >
>> > Output:
>> > [node123.prv.sciama.cluster:88310] mca: base: components_register: registering btl components
>> > [node123.prv.sciama.cluster:88310] mca: base: components_register: found loaded component tcp
>> > [node123.prv.sciama.cluster:88310] mca: base: components_register: component tcp register function successful
>> > [node123.prv.sciama.cluster:88310] mca: base: components_open: opening btl components
>> > [node123.prv.sciama.cluster:88310] mca: base: components_open: found loaded component tcp
>> > [node123.prv.sciama.cluster:88310] mca: base: components_open: component tcp open function successful
>> > [node124.prv.sciama.cluster:90465] mca: base: components_register: registering btl components
>> > [node124.prv.sciama.cluster:90465] mca: base: components_register: found loaded component tcp
>> > [node124.prv.sciama.cluster:90465] mca: base: components_register: component tcp register function successful
>> > [node124.prv.sciama.cluster:90465] mca: base: components_open: opening btl components
>> > [node124.prv.sciama.cluster:90465] mca: base: components_open: found loaded component tcp
>> > [node124.prv.sciama.cluster:90465] mca: base: components_open: component tcp open function successful
>> > Hello, world.  I am 1 on node124
>> > Hello, world.  I am 0 on node123
>> > Size Latency(usec) Bandwidth(Mbps)
>> > 1    4.18    1.83
>> > 2    3.66    4.17
>> > 4    4.08    7.48
>> > 8    3.12    19.57
>> > 16    3.83    31.84
>> > 32    3.40    71.84
>> > 64    4.10    118.97
>> > 128    3.89    251.19
>> > 256    4.22    462.77
>> > 512    2.95    1325.71
>> > 1024    2.63    2969.49
>> > 2048    3.38    4628.29
>> > [node123.prv.sciama.cluster:88310] mca: base: close: component tcp closed
>> > [node123.prv.sciama.cluster:88310] mca: base: close: unloading component tcp
>> > [node124.prv.sciama.cluster:90465] mca: base: close: component tcp closed
>> > [node124.prv.sciama.cluster:90465] mca: base: close: unloading component tcp
>> >
>> > Moreover, the same benchmark applications using MPICH work fine over
>> > Ethernet and achieve the expected latency and bandwidth.
>> >
>> > How can this be fixed?
>> >
>> > Thanks for help,
>> >
>> > --Ansar
>>
>
>
>
>
> --
> Regards
>
> Ansar Javed
> HPC Lab
> SEECS NUST
> Contact: +92 334 438 9394
> Skype: ansar.javed.859
> Email: muhammad.an...@seecs.edu.pk
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/09/25299.php
>
