I was running a test on our IB-based cluster with IB disabled:

--mca btl ^openib --mca mtl ^mxm

I was sending very large messages (>1 GB) and was surprised by the speed.

I then noticed that all three of our Ethernet interfaces were getting traffic:

eth0   (1 GigE)
ib0    (IP over IB, for Lustre configuration at vendor request)
eoib0  (Ethernet over IB interface for an IB -> Ethernet gateway, for some
       external storage support at >1 Gig speed)

We use Torque as our resource manager with TM support; the hostnames given
by Torque match the eth0 interfaces.

How does OMPI figure out that it can also talk over the others?  How does it
choose to load balance?

BTW that is fine, but we will use if_exclude on one of the IB ones, as ib0 and
eoib0 are the same physical device and may screw with load balancing if anyone
ever falls back to TCP.
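
For reference, a rough sketch of what that exclusion might look like, assuming
the TCP BTL's btl_tcp_if_exclude parameter is the right knob here. The choice
of ib0 over eoib0 and the ./my_app binary are placeholders, not our actual
setup. As far as I know, setting this list replaces the default, so the
loopback interface has to be listed explicitly as well:

```shell
# Sketch only: disable the IB transports as in the test above, then keep
# the TCP BTL off loopback and off one of the two IB-backed interfaces.
# "ib0" could just as well be "eoib0"; ./my_app is a placeholder binary.
mpirun --mca btl ^openib --mca mtl ^mxm \
       --mca btl_tcp_if_exclude lo,ib0 \
       ./my_app
```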

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985


