Ralph,
IIRC there is load balancing across all the BTLs, for example
between vader and scif.
So load balancing between ib0 and eoib0 is just a particular case that might
not necessarily be handled by the TCP BTL.
Cheers,
Gilles
Ralph Castain wrote:
>OMPI discovers all active interfaces and aut
Brock,
Is your post about ib0/eoib0 being used at all, or about them being used with
load balancing?
Let me clarify:
--mca btl ^openib
disables the openib BTL, i.e. *native* InfiniBand.
It does not disable ib0 and eoib0, which are handled by the TCP BTL.
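
As a rough sketch of the distinction above: excluding openib only removes the native path, so a second parameter is needed to narrow what the TCP BTL itself uses. The snippet only assembles and prints a command line. It assumes eth0 exists under that name on every node, and ./a.out and -np 2 are placeholders rather than anything from this thread.

```shell
# Interface the TCP BTL should be limited to (assumed to exist on every node).
keep_if="eth0"

# ^openib drops native InfiniBand only; btl_tcp_if_include restricts the
# TCP BTL to the listed interface(s), which leaves ib0 and eoib0 unused.
# ./a.out and -np 2 are placeholders.
cmd="mpirun --mca btl ^openib --mca btl_tcp_if_include ${keep_if} -np 2 ./a.out"

# Print the command instead of running it, so it can be reviewed first.
echo "$cmd"
```
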
As you already figured out, btl_tcp_if_inc
OMPI discovers all active interfaces and automatically considers them available
for its use unless instructed otherwise via the params. I’d have to look at the
TCP BTL code to see the loadbalancing algo - I thought we didn’t have that “on”
by default across BTLs, but I don’t know if the TCP one
I was doing a test on our IB-based cluster, where I was disabling IB
--mca btl ^openib --mca mtl ^mxm
I was sending very large messages (>1GB) and I was surprised by the speed.
I noticed then that of all our ethernet interfaces
eth0 (1gig-e)
ib0 (ip over ib, for lustre configuration at vendor r
Ah, yes - so here is what is happening. When no slot info is provided, we use
the number of detected cores on each node as the #slots. So if you want to
load-balance across the nodes, you need to set --map-by node,
or add slots=1 to each line of your host file to override the default behavior.
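
Both options might look roughly like this for the four-node host file in the report. This is only a sketch: hosts_1slot.dat is a made-up file name, ./a.out is a placeholder binary, and the mpirun lines are left as comments so the snippet runs without an MPI installation.

```shell
# Recreate the four-node host file from the report.
printf '%s\n' node01 node02 node03 node04 > hosts.dat

# Option 1: keep hosts.dat as it is and place ranks round-robin by node:
#   mpirun --machinefile hosts.dat -np 4 --map-by node ./a.out

# Option 2: declare one slot per node, so the default placement
# has to move on to the next node after each rank.
awk 'NF { print $1, "slots=1" }' hosts.dat > hosts_1slot.dat
cat hosts_1slot.dat
#   mpirun --machinefile hosts_1slot.dat -np 4 ./a.out
```

With either option, -np 4 should put one rank on each of node01 through node04 instead of filling node01 first.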
> On
Here's my command:
/bin/mpirun --machinefile hosts.dat -np 4
Here's my hosts.dat file:
% cat hosts.dat
node01
node02
node03
node04
All 4 ranks are launched on node01. I don't believe I've ever seen this
before. I had to do a sanity check, so I tried MVAPICH2-2.1a and got what I
expected:
Let me clarify, as that wasn’t very clear: whether we enable or disable GDR
makes no difference. It seems to be in the base code,
Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
Advanced Software Platforms Lab,
SRA - SV,
Samsung Electronics,
1732 North First Street,
San Jose, CA 95112,
Work: +