[OMPI users] btl_openib_if_include

2018-04-20 Thread Marshall2, John (SSC/SPC)
Hi,

I am trying to determine the proper setting for
btl_openib_if_include.

Some background:
* Open MPI 2.1.1 (and 1.6.5; yes, it is old)
* LXC containers
* SR-IOV (virtual functions) in use
* dedicated IB interface (e.g., ib2) per container

Should the mlx4_X:1 correspond to a specific ibY interface? For example, for
ib26 I arrive at mlx4_13:1 from:
$ ls /sys/class/net/ib26/device/infiniband
mlx4_13

Does the mlx4_X have to be determined at each location where an MPI task
would run? I suppose it would, because the ibY is likely to be different.

In some tests, I have found that the setting:
export OMPI_MCA_btl_openib_if_include=mlx4_0:1

provides better performance than not specifying a value or letting mpirun/orted
figure it out at runtime.
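
The per-location lookup described above can be sketched as a small shell
function. This is only an illustration, not a recommended procedure: the
function name ib_device_for_iface and the SYSFS_NET_ROOT override are
hypothetical, port 1 is assumed as in the examples above, and the first
entry under device/infiniband is assumed to be the device of interest.

```shell
# Sketch: find the mlx4_X device behind a given ibY interface on the local
# node, using the same sysfs lookup as the "ls" command above.
# SYSFS_NET_ROOT is a hypothetical override (handy for testing); by default
# the real /sys/class/net tree is used.
ib_device_for_iface() {
    iface="$1"
    root="${SYSFS_NET_ROOT:-/sys/class/net}"
    dir="$root/$iface/device/infiniband"

    # No such interface, or the interface is not backed by an IB device.
    [ -d "$dir" ] || return 1

    # Print the first device name found (assumption: one device per interface).
    for entry in "$dir"/*; do
        [ -e "$entry" ] || return 1
        basename "$entry"
        return 0
    done
    return 1
}

# Example use on each node or container before launching (port 1 assumed):
#   dev=$(ib_device_for_iface ib26) && \
#       export OMPI_MCA_btl_openib_if_include="${dev}:1"
```

Because the ibY name may differ per container, the interface name would have
to be supplied separately at each location.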

Thanks,
John
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] Possible to exclude a hwloc_base_binding_policy?

2018-04-20 Thread Saurabh T
Hi,
After switching to Open MPI 3, I started getting error messages of the form
"No objects of the specified type were found on at least one node:
Type: NUMANode
...
ORTE has lost communication with a remote daemon.
..."

After some research, I found that the default hwloc_base_binding_policy (for
np > 2) changed from socket in Open MPI v2 to numa in v3. This can be seen
from "ompi_info --param all all --level 9". I have verified that the switch to
numa is causing the failures: if I set the policy to socket, it works.

My question is: how can I set the variable in openmpi-mca-params.conf to
exclude numa, i.e., use whatever its rules are, except numa? I tried
"hwloc_base_binding_policy = ^numa" (similar to, say, "btl = ^sm"), but this
didn't work. Is what I want possible, or should I live with the socket policy
for all cases?
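
For reference, the socket workaround mentioned above can be expressed as
follows. This is a minimal sketch that only pins the policy to socket; it
does not show any exclusion syntax, and the application name ./my_app in the
comments is hypothetical.

```shell
# Pin the binding policy to socket through the environment, using the
# OMPI_MCA_ prefix that Open MPI reads for MCA parameters.
export OMPI_MCA_hwloc_base_binding_policy=socket

# Equivalent line in openmpi-mca-params.conf:
#   hwloc_base_binding_policy = socket
#
# Equivalent on the mpirun command line:
#   mpirun --mca hwloc_base_binding_policy socket ./my_app
```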

Thank you.
saurabh