If you do not use --disable-dlopen, then some components will depend on libnl, others will depend on libnl3, and some might even depend on both. So depending on which components are loaded, you might or might not run into this issue.
On my CentOS 7 virtual machine, both libnl-devel and libnl3-devel are installed.
Everything seems fine, except that
ompi_info --all
crashes.

If you use --disable-dlopen, components are no longer built as their own libraries; they are all "merged" into the same one. Consequently, if some components require libnl and others require libnl3, the merged library will depend on both libnl and libnl3, so you will surely run into this kind of issue.
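
To see which flavor each library pulls in, something like the following could help. It is only a minimal sketch: the function names classify_libnl and report_libnl are made up for illustration, and the soname patterns are assumed from the ldd output quoted further down in this thread (libnl.so.1 for libnl, libnl-3.so.* and libnl-route-3.so.* for libnl3).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Open MPI.

# Read `ldd` output on stdin and print which libnl flavor(s) it references:
# one of "both", "libnl1", "libnl3", or "none".
classify_libnl() {
    awk '
        /libnl\.so\.1/          { v1 = 1 }   # libnl 1.x soname
        /libnl(-[a-z]+)?-3\.so/ { v3 = 1 }   # libnl-3, libnl-route-3, ...
        END {
            if (v1 && v3) print "both"
            else if (v1)  print "libnl1"
            else if (v3)  print "libnl3"
            else          print "none"
        }'
}

# Print "<file>: <flavor>" for every shared object or binary given.
report_libnl() {
    for f in "$@"; do
        printf '%s: %s\n' "$f" "$(ldd "$f" 2>/dev/null | classify_libnl)"
    done
}
```

With a default (dlopen) build you could run report_libnl on the mca_*.so files under the lib/openmpi directory of your install prefix (assuming that is where the components were installed); with --disable-dlopen, run it on the main Open MPI libraries instead. Anything reported as "both" is a likely candidate for the assertion failure.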

Cheers,

Gilles

On 3/28/2016 3:51 PM, dpchoudh . wrote:
Hello Gilles

Thank you very much for your prompt response!

Here are the answers to your questions:

[durga@smallMPI ~]$ ldd `which mpicc` | grep libnl
    libnl.so.1 => /lib64/libnl.so.1 (0x00007f79b2d8a000)
    libnl-route-3.so.200 => /lib64/libnl-route-3.so.200 (0x00007f79b1c44000)
    libnl-3.so.200 => /lib64/libnl-3.so.200 (0x00007f79b1a28000)

So yes, mpicc does seem to need both libnl and libnl3. And this is even though libnl3-devel is NOT installed on my system:

[durga@smallMPI ~]$ sudo yum list installed | grep libnl
libnl.x86_64 1.1.4-3.el7                     @anaconda
libnl-devel.x86_64 1.1.4-3.el7                     @anaconda
libnl3.x86_64 3.2.21-10.el7                   @base
libnl3-cli.x86_64 3.2.21-10.el7                   @base


Could it be because of the --disable-dlopen switch to ./configure? The other two switches (--enable-debug and --enable-debug-symbols) seem pretty harmless.

I'll try your other suggestion and let you know the outcome shortly.

Thanks
Durga


We learn from history that we never learn from history.

On Mon, Mar 28, 2016 at 2:37 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

    Does this happen only with master?

    What does
    ldd mpicc
    say?
    Does it require both libnl and libnl3?

    libnl3 is used by OpenMPI if the libnl3-devel package is installed,
    and this is not the case on your system.

    A possible root cause is that third-party libs use libnl3 while the
    reachable/netlink component tries to use libnl. In this case,
    installing libnl3-devel should fix your issue.
    /* you will need to re-configure after that */

    Another possible root cause is that some third-party libs use libnl
    and others use libnl3, and in this case, I am afraid there is no
    simple workaround.
    If installing libnl3-devel does not solve your issue, you can give
    https://github.com/open-mpi/ompi/pull/1014
    a try. At least it will abort with an error message that states
    which lib is using libnl and which is using libnl3.

    I am afraid the only option is to manually disable some
    components, so that only one flavor of libnl is used.
    That can be achieved by adding an empty .opal_ignore file in the
    directory of each component you want to disable.
    /* you will need to rerun autogen.pl after that */
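
The .opal_ignore step described above could be scripted roughly as follows. This is only a sketch: ignore_components is a made-up helper name, and the component path in the usage note is an assumption about where reachable/netlink lives in the source tree, so check your own checkout first.

```shell
#!/bin/sh
# Hypothetical helper, not part of Open MPI: drop an empty .opal_ignore file
# into each component directory given (paths relative to the source tree).
ignore_components() {
    srcdir=$1
    shift
    for comp in "$@"; do
        dir="$srcdir/$comp"
        if [ -d "$dir" ]; then
            : > "$dir/.opal_ignore"      # create (or truncate to) an empty file
            echo "ignored: $comp"
        else
            echo "no such component directory: $comp" >&2
            return 1
        fi
    done
}
```

For example, ignore_components /path/to/ompi opal/mca/reachable/netlink (assumed path), followed by rerunning autogen.pl and then configure with your usual options, and rebuilding.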

    Cheers,

    Gilles

    On 3/28/2016 3:16 PM, dpchoudh . wrote:
    Hello all

    The system in question is a CentOS 7 box, that has been running
    OpenMPI, both the master branch and the 1.10.2 release happily
    until now.

    Just now, in order to debug something, I recompiled with the
    following options:

    $ ./configure --enable-debug --enable-debug-symbols --disable-dlopen

    The compilation and install were successful; however, mpicc now
    crashes like this:

    [durga@smallMPI ~]$ mpicc -Wall -Wextra -o mpitest mpitest.c
    mpicc: route/tc.c:973: rtnl_tc_register: Assertion `0' failed.
    Aborted (core dumped)


    Searching the mailing archive, I found two posts that describe
    similar situations:

    https://www.open-mpi.org/community/lists/devel/2015/08/17812.php
    http://www.open-mpi.org/community/lists/users/2015/11/28016.php

    However, the solution proposed in these, to disable verbs, is not
    acceptable to me for the following reasons: I am trying to
    implement a new BTL by reverse engineering the openib BTL. I am
    using a Qlogic HCA for this purpose. (Please note that I cannot
    use PSM as I am writing code for a BTL)

    Are there any more acceptable solutions for this? Here is the
    list of nl libraries on my box:

    [durga@smallMPI ~]$ sudo yum list installed | grep libnl
    libnl.x86_64 1.1.4-3.el7                     @anaconda
    libnl-devel.x86_64 1.1.4-3.el7                     @anaconda
    libnl3.x86_64 3.2.21-10.el7                   @base
    libnl3-cli.x86_64 3.2.21-10.el7                   @base

    And uninstalling libnl3 is not an option either: yum wants to
    uninstall around 100-odd other packages because of dependencies,
    which would essentially render the machine unusable.

    Please help!

    Thanks in advance
    Durga

    We learn from history that we never learn from history.


    _______________________________________________
    users mailing list
    us...@open-mpi.org
    Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
    Link to this post:
    http://www.open-mpi.org/community/lists/users/2016/03/28855.php






