OK, I figured as much; I'm going to have to read some more for my own
curiosity. The reason I mention the resource manager we use, and that the
hostnames given by PBS/Torque match the 1gig-e interfaces, is that I'm curious
what path traffic takes to reach a peer node when every entry in the node list
matches the 1gig interfaces, yet data is being sent out the 10gig eoib0/ib0
interfaces.

I'll go do some measurements and see.
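
Probably I'll start with something simple like watching the per-interface byte
counters on a node before and after one of these large-message runs (standard
Linux sysfs paths; the interface names are just ours):

    for i in eth0 ib0 eoib0; do
        echo "$i: $(cat /sys/class/net/$i/statistics/tx_bytes)"
    done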

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985



> On Nov 8, 2014, at 8:30 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
> wrote:
> 
> Ralph is right: OMPI aggressively uses all Ethernet interfaces by default.  
> 
> This short FAQ has links to 2 other FAQs that provide detailed information 
> about reachability:
> 
>    http://www.open-mpi.org/faq/?category=tcp#tcp-multi-network
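> 
> If you want to see which interfaces the TCP BTL actually decides to use for a
> given run, something like this should show its reachability decisions (the
> verbosity level just needs to be high enough to be chatty):
> 
>    mpirun --mca btl tcp,self --mca btl_base_verbose 100 ./your_app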
> 
> The usNIC BTL uses UDP for its wire transport and does a much more
> standards-conformant peer reachability determination (i.e., it actually
> checks routing tables to see if it can reach a given peer, which brings all
> kinds of caching benefits, kernel controls if you want them, etc.).  We
> haven't back-ported this to the TCP BTL because a) most people who use TCP
> for MPI still use a single L2 address space, and b) no one has asked for
> it.  :-)
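> 
> (For what it's worth, that kind of check is roughly what you'd get by asking
> the kernel routing table yourself on Linux, e.g.:
> 
>    ip route get 10.10.0.42
> 
> which shows the interface and source address the kernel would use to reach
> that peer; the address above is just a placeholder.)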
> 
> As for the round robin scheduling, there's no indication from the Linux TCP 
> stack what the bandwidth is on a given IP interface.  So unless you use the 
> btl_tcp_bandwidth_<IP_INTERFACE_NAME> (e.g., btl_tcp_bandwidth_eth0) MCA 
> params, OMPI will round-robin across them equally.
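> 
> For example, to tell OMPI that the IB-backed interfaces have roughly 10x the
> bandwidth of eth0, something along these lines should work (I believe the
> values are interpreted as Mbps; the exact numbers here are just
> illustrative):
> 
>    mpirun --mca btl_tcp_bandwidth_eth0 1000 \
>           --mca btl_tcp_bandwidth_ib0 10000 \
>           --mca btl_tcp_bandwidth_eoib0 10000 ./your_app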
> 
> If you have multiple IP interfaces sharing a single physical link, there will 
> likely be no benefit from having Open MPI use more than one of them.  You 
> should probably use btl_tcp_if_include / btl_tcp_if_exclude to select just 
> one.
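> 
> For example, to limit the TCP BTL to just eth0:
> 
>    mpirun --mca btl_tcp_if_include eth0 ./your_app
> 
> or, going the other way, to exclude one of the redundant IB-backed
> interfaces (keep lo in the list, since setting the param replaces the
> default exclusions):
> 
>    mpirun --mca btl_tcp_if_exclude lo,ib0 ./your_app
> 
> Use one or the other; if_include and if_exclude are mutually exclusive.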
> 
> 
> 
> 
> On Nov 7, 2014, at 2:53 PM, Brock Palen <bro...@umich.edu> wrote:
> 
>> I was doing a test on our IB-based cluster, where I was disabling IB:
>> 
>> --mca btl ^openib --mca mtl ^mxm
>> 
>> I was sending very large messages (>1 GB) and I was surprised by the speed.
>> 
>> I then noticed that all three of our Ethernet interfaces were getting
>> traffic:
>> 
>> eth0   (1gig-e)
>> ib0    (IP over IB, for Lustre configuration at vendor request)
>> eoib0  (Ethernet over IB, for an IB -> Ethernet gateway for some external
>>        storage support at >1Gig speed)
>> 
>> We use Torque for our resource manager, with TM support; the hostnames
>> given by Torque match the eth0 interfaces.
>> 
>> How does OMPI figure out that it can also talk over the others?  How does
>> it choose to load balance?
>> 
>> BTW that is fine, but we will use if_exclude on one of the IB ones, as ib0
>> and eoib0 are the same physical device and may screw with load balancing if
>> anyone ever falls back to TCP.
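>> 
>> We'd probably just set that cluster-wide in the MCA params file rather than
>> per job, something like (file location depends on the install; this is the
>> usual default):
>> 
>>    # <prefix>/etc/openmpi-mca-params.conf  (or ~/.openmpi/mca-params.conf)
>>    btl_tcp_if_exclude = lo,ib0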
>> 
>> Brock Palen
>> www.umich.edu/~brockp
>> CAEN Advanced Computing
>> XSEDE Campus Champion
>> bro...@umich.edu
>> (734)936-1985
>> 
>> 
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
