Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Ralph, IIRC there is load balancing across all the BTLs, for example between vader and scif. So load balancing between ib0 and eoib0 is just a particular case that might not necessarily be handled by the tcp BTL. Cheers, Gilles Ralph Castain wrote: >OMPI discovers all active interfaces and aut…

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Brock, Is your post related to ib0/eoib0 being used at all, or being used with load balancing? Let me clarify this: --mca btl ^openib disables the openib BTL, aka *native* InfiniBand. It does not disable ib0 and eoib0, which are handled by the tcp BTL. As you already figured out, btl_tcp_if_inc…
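For illustration, a minimal sketch of keeping the IPoIB interfaces away from the tcp BTL (the interface names come from this thread; the executable and process count are placeholders):

    mpirun --mca btl ^openib --mca btl_tcp_if_exclude lo,ib0,eoib0 -np 4 ./a.out

Note that setting btl_tcp_if_exclude replaces the default exclude list, so the loopback interface (lo) should normally be listed explicitly.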

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Ralph Castain
OMPI discovers all active interfaces and automatically considers them available for its use unless instructed otherwise via the params. I’d have to look at the TCP BTL code to see the load-balancing algo - I thought we didn’t have that “on” by default across BTLs, but I don’t know if the TCP one…

[OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Brock Palen
I was doing a test on our IB based cluster, where I was disabling IB: --mca btl ^openib --mca mtl ^mxm I was sending very large messages (>1GB) and I was surprised by the speed. I noticed then that of all our ethernet interfaces eth0 (1gig-e) ib0 (ip over ib, for lustre configuration at vendor r…
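For comparison, a hedged sketch of pinning such a run to the 1GigE interface only, so the TCP BTL cannot silently use the IPoIB interfaces as well (executable and process count are placeholders):

    mpirun --mca btl ^openib --mca mtl ^mxm --mca btl_tcp_if_include eth0 -np 2 ./a.out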

Re: [OMPI users] Question on mapping processes to hosts file

2014-11-07 Thread Ralph Castain
Ah, yes - so here is what is happening. When no slot info is provided, we use the number of detected cores on each node as the #slots. So if you want to load balance across the nodes, you need to set --map-by node, or add slots=1 to each line of your host file to override the default behavior.
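A minimal sketch of the two workarounds described above (host names follow the hosts.dat from this thread; the executable is a placeholder):

    % cat hosts.dat
    node01 slots=1
    node02 slots=1
    node03 slots=1
    node04 slots=1
    % mpirun --machinefile hosts.dat -np 4 ./a.out

or, keeping the original host file unchanged:

    % mpirun --machinefile hosts.dat --map-by node -np 4 ./a.out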

[OMPI users] Question on mapping processes to hosts file

2014-11-07 Thread Blosch, Edwin L
Here's my command: /bin/mpirun --machinefile hosts.dat -np 4 Here's my hosts.dat file: % cat hosts.dat node01 node02 node03 node04 All 4 ranks are launched on node01. I don't believe I've ever seen this before. I had to do a sanity check, so I tried MVAPICH2-2.1a and got what I expected:…

Re: [OMPI users] Randomly long (100ms vs 7000+ms) fulfillment of MPI_Ibcast

2014-11-07 Thread Steven Eliuk
Let me clarify as that wasn’t very clear… if we enable, or disable, GDR it doesn’t make a difference. Seems to be in the base code, Kindest Regards, — Steven Eliuk, Ph.D. Comp Sci, Advanced Software Platforms Lab, SRA - SV, Samsung Electronics, 1732 North First Street, San Jose, CA 95112, Work: +