Hello Gilles
Thanks for your prompt response, and apologies for my delayed reply.
The hang issue is fixed now. It seems that OpenMPI prefers PSM when
it detects QLogic HCAs, even when I pass -mca btl openib,self. Adding
another parameter, -mca pml ob1, fixed the issue. There is nothing
if you are running with master, i recommend you
mpirun --mca mpi_add_procs_cutoff 1024 ...
in order to avoid the crash i just reported at
https://github.com/open-mpi/ompi/issues/1501
Cheers,
Gilles
On 3/28/2016 4:44 PM, Gilles Gouaillardet wrote:
at first, does it hang when running on only one node ?
when the hang occurs, you can collect stack traces
(run pstack on mpitest)
to see where it hangs.
since you configure'd with --disable-dlopen, it means your btl has been
slurped into openmpi.
that means some parts of it are executed, and it
Hello Gilles
Per your suggestion, installing libnl3-devel does fix the mpicc issue,
but there still seems to be another issue down the road: the generated
executable seems to hang. I have tried sm, tcp and openib BTLs, all with
the same result:
[durga@smallMPI ~]$ mpirun -np 2 -H smallMPI,bigMP
If you do not use --disable-dlopen, then some components will depend on
libnl, and some others will depend on libnl3. some might even depend on
both libnl and libnl3.
so based on which component is loaded, you might or might not run into
this issue.
on my centos 7 virtual machine, libnl-devel an
Hello Gilles
Thank you very much for your prompt response!
Here are the answers to your questions:
[durga@smallMPI ~]$ ldd `which mpicc` | grep libnl
libnl.so.1 => /lib64/libnl.so.1 (0x7f79b2d8a000)
libnl-route-3.so.200 => /lib64/libnl-route-3.so.200 (0x7f79b1c44000)
libnl-3.
Does this happen only with master ?
what does
ldd mpicc
say ?
does it require both libnl and libnl3 ?
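One rough way to answer that question is to filter the ldd output for the two library generations. The sketch below is only an illustration: the helper name classify_libnl is made up for this example, and it assumes libnl 1.x shows up as libnl.so.1 and libnl 3.x as libnl-3.so.* or libnl-<module>-3.so.* (as in the libnl-route-3.so.200 line quoted elsewhere in this thread).

```shell
# Hypothetical helper (not part of Open MPI): reads `ldd` output on stdin
# and prints which libnl generations it mentions: both, libnl1, libnl3 or none.
classify_libnl() {
    nl_input=$(cat)
    nl_has1=no
    nl_has3=no
    # Assumption: libnl 1.x is linked as libnl.so.1
    if printf '%s\n' "$nl_input" | grep -q 'libnl\.so\.1'; then
        nl_has1=yes
    fi
    # Assumption: libnl 3.x is linked as libnl-3.so.* or libnl-<module>-3.so.*
    if printf '%s\n' "$nl_input" | grep -Eq 'libnl(-[a-z]+)?-3\.so'; then
        nl_has3=yes
    fi
    if [ "$nl_has1" = yes ] && [ "$nl_has3" = yes ]; then
        echo both
    elif [ "$nl_has1" = yes ]; then
        echo libnl1
    elif [ "$nl_has3" = yes ]; then
        echo libnl3
    else
        echo none
    fi
}

# Example usage (assumes mpicc is on the PATH):
#   ldd "$(which mpicc)" | classify_libnl
```

A result of "both" would mean the binary pulls in the two generations at once, which is the situation being asked about here.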
libnl3 is used by OpenMPI if libnl3-devel package is installed,
and this is not the case on your system
a possible root cause is third party libs use libnl3, and the
reachable/netlink compone
Hello all
The system in question is a CentOS 7 box that has been running OpenMPI,
both the master branch and the 1.10.2 release, happily until now.
Just now, in order to debug something, I recompiled with the following
options:
$ ./configure --enable-debug --enable-debug-symbols --disable-dlopen