It turns out that the contents of /etc are held in RAM, so mxm.conf wasn't there because that area hadn't been refreshed yet, which happens either when an admin manually pushes it out or on reboot. The admins pushed it out, and now ldd on libhcoll.so resolves the libmxm.so dependency. configure also works without having to specify LD_LIBRARY_PATH.
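
For anyone checking for the same condition, here is a rough sketch of a test for whether a library directory is registered with the dynamic linker configuration. The helper name dir_in_ldconf is made up for illustration, and the assumption that mxm.conf contains the single line /opt/mellanox/mxm/lib is a guess based on the paths in this thread, not something confirmed from the rpm.

```shell
# Sketch: report (via exit status) whether a library directory is listed
# in any *.conf file under a linker configuration directory.
# dir_in_ldconf is a hypothetical helper name, not an existing tool.
dir_in_ldconf() {
    libdir=$1
    confdir=${2:-/etc/ld.so.conf.d}
    # -x matches whole lines, -F treats the path as a fixed string,
    # -q suppresses output so that only the exit status is used.
    grep -qxF -- "$libdir" "$confdir"/*.conf 2>/dev/null
}

# Example (the path is an assumption taken from this thread):
#   dir_in_ldconf /opt/mellanox/mxm/lib || echo "mxm lib dir not registered"
#   ldconfig -p | grep libmxm    # look in the loader cache itself
```

The second example line queries the cache that ldconfig builds, which is what the loader actually consults, so a conf file that exists but has not been picked up by ldconfig would still show nothing there.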

So, not an Open MPI issue, but I am very grateful for all the help!
David

On 10/21/2015 12:00 PM, David Shrader wrote:
I'm sorry I missed reporting on that. I do not have /etc/ld.so.conf.d/mxm.conf.

Interestingly enough, the rpm reports that it does include that file, but it isn't there:

[dshrader@zo-fe1 serial]$ rpm -qa | grep mxm
mxm-3.4.3065-1.x86_64
[dshrader@zo-fe1 serial]$ rpm -ql mxm-3.4.3065-1.x86_64
/etc/ld.so.conf.d/mxm.conf
...output snipped...
[dshrader@zo-fe1 serial]$ ll /etc/ld.so.conf.d/mxm.conf
ls: cannot access /etc/ld.so.conf.d/mxm.conf: No such file or directory

I'll follow up with the admin who installed the rpm.
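
A minimal sketch of how the mismatch above could be spotted for every file in a package at once: feed the package file list to a small filter that prints the paths that do not exist. The helper name report_missing is hypothetical; rpm -ql and rpm -V are the standard query and verify modes.

```shell
# Sketch: read file paths on stdin (one per line) and print those that
# do not exist. report_missing is a hypothetical helper name.
report_missing() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf 'missing: %s\n' "$path"
    done
}

# Example with the package name reported above:
#   rpm -ql mxm-3.4.3065-1.x86_64 | report_missing
# rpm's own verify mode covers this check and more:
#   rpm -V mxm-3.4.3065-1.x86_64
```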

Thanks,
David

On 10/21/2015 11:37 AM, Mike Dubman wrote:
Could you please check whether you have the file /etc/ld.so.conf.d/mxm.conf on your system? It will help us understand why hcoll did not detect libmxm.so on the first attempt.

Thanks

On Wed, Oct 21, 2015 at 7:19 PM, David Shrader <dshra...@lanl.gov> wrote:

    We're using TOSS, which is based on Red Hat. The current version
    we're running is based on Red Hat 6.6. I'm actually not sure which
    mofed version we're using right now from what I can find on
    the system, and the admins responsible for that are out. I'll get
    back to you on that as soon as I know.

    Using LD_LIBRARY_PATH before configure got it to work, which I
    didn't expect. Thanks for the tip! I didn't realize that finding
    the shared-library dependencies of a library that is being linked
    in on the active compile line fell under the runtime portion of
    linking, and could be affected by using LD_LIBRARY_PATH.
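
For context, the GNU ld manual lists LD_LIBRARY_PATH among the places a native linker searches for libraries that are needed by other shared libraries, which is why setting it changed the configure result. The sketch below shows one way to prepend a directory without leaving an empty path element behind; prepend_path is a made-up helper name, and the mxm directory is the one suggested earlier in the thread.

```shell
# Sketch: print a search path with a directory prepended, without leaving
# an empty element when the existing value is unset or empty (the dynamic
# loader treats an empty element as the current directory).
# prepend_path is a hypothetical helper name.
prepend_path() {
    newdir=$1
    current=$2
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$newdir" "$current"
    else
        printf '%s\n' "$newdir"
    fi
}

# Example, set for the configure run only:
#   LD_LIBRARY_PATH=$(prepend_path /opt/mellanox/mxm/lib "${LD_LIBRARY_PATH:-}") ./configure ...
```

Setting the variable on the command line in front of ./configure keeps the change scoped to that one invocation instead of the whole login session.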

    Thanks!
    David


    On 10/21/2015 09:59 AM, Mike Dubman wrote:
    Hi David,
    What Linux distro do you use (and which mofed version)?
    Do you have the /etc/ld.so.conf.d/mxm.conf file?
    Can you please try adding LD_LIBRARY_PATH=/opt/mellanox/mxm/lib
    in front of ./configure ....?


    Thanks

    On Wed, Oct 21, 2015 at 6:40 PM, David Shrader
    <dshra...@lanl.gov> wrote:

        I should probably point out that libhcoll.so does not know
        where libmxm.so is:

        [dshrader@zo-fe1 ~]$ ldd /opt/mellanox/hcoll/lib/libhcoll.so
                linux-vdso.so.1 => (0x00007fffb2f1f000)
                libibnetdisc.so.5 => /usr/lib64/libibnetdisc.so.5 (0x00007fe31bd0b000)
                libmxm.so.2 => not found
                libz.so.1 => /lib64/libz.so.1 (0x00007fe31baf4000)
                libdl.so.2 => /lib64/libdl.so.2 (0x00007fe31b8f0000)
                libosmcomp.so.3 => /usr/lib64/libosmcomp.so.3 (0x00007fe31b6e2000)
                libocoms.so.0 => /opt/mellanox/hcoll/lib/libocoms.so.0 (0x00007fe31b499000)
                libm.so.6 => /lib64/libm.so.6 (0x00007fe31b215000)
                libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007fe31b009000)
                libalog.so.0 => /opt/mellanox/hcoll/lib/libalog.so.0 (0x00007fe31adfe000)
                librt.so.1 => /lib64/librt.so.1 (0x00007fe31abf6000)
                libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00007fe31a9ee000)
                librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007fe31a7d9000)
                libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007fe31a5c7000)
                libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe31a3a9000)
                libc.so.6 => /lib64/libc.so.6 (0x00007fe31a015000)
                libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007fe319cfe000)
                libibmad.so.5 => /usr/lib64/libibmad.so.5 (0x00007fe319ae3000)
                /lib64/ld-linux-x86-64.so.2 (0x00007fe31c2d3000)
                libwrap.so.0 => /lib64/libwrap.so.0 (0x00007fe3198d8000)
                libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe3196c2000)
                libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fe3194a8000)
                libutil.so.1 => /lib64/libutil.so.1 (0x00007fe3192a5000)
                libnl.so.1 => /lib64/libnl.so.1 (0x00007fe319052000)

        Both hcoll and mxm were installed using the rpms provided
        by Mellanox.
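
With an ldd listing this long, the one unresolved entry is easy to miss. A small filter, sketched below, prints only the libraries that ldd reports as not found; unresolved_libs is a hypothetical helper name, and the filter assumes the usual "name => not found" line format shown above.

```shell
# Sketch: read ldd output on stdin and print the names of libraries that
# ldd reports as "not found". unresolved_libs is a hypothetical helper name.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example:
#   ldd /opt/mellanox/hcoll/lib/libhcoll.so | unresolved_libs
```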

        Thanks again,
        David


        On 10/21/2015 09:34 AM, David Shrader wrote:

            Hello All,

            I'm currently trying to install 1.10.0 with hcoll and
            mxm, and am getting an error during configure:

            --- MCA component coll:hcoll (m4 configuration macro)
            checking for MCA component coll:hcoll compile mode... static
            checking hcoll/api/hcoll_api.h usability... yes
            checking hcoll/api/hcoll_api.h presence... yes
            checking for hcoll/api/hcoll_api.h... yes
            looking for library in lib
            checking for library containing hcoll_get_version... no
            looking for library in lib64
            checking for library containing hcoll_get_version... no
            configure: error: HCOLL support requested but not found.  Aborting

            The configure line I used:

            ./configure --with-mxm=/opt/mellanox/mxm
            --with-hcoll=/opt/mellanox/hcoll
            --with-platform=contrib/platform/lanl/toss/optimized-panasas

            Here are the corresponding lines from config.log:

            configure:217014: gcc -std=gnu99 -o conftest -O3
            -DNDEBUG -I/opt/panfs/include -finline-functions
            -fno-strict-aliasing -pthread
            -I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/hwloc/hwloc191/hwloc/include
            -I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/event/libevent2021/libevent
            -I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/event/libevent2021/libevent/include
            -I/opt/mellanox/hcoll/include  -L/opt/mellanox/hcoll/lib
            conftest.c -lhcoll  -lrt -lm -lutil   >&5
            /usr/bin/ld: warning: libmxm.so.2, needed by /opt/mellanox/hcoll/lib/libhcoll.so, not found (try using -rpath or -rpath-link)
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_req_recv'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_create'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_free_context_opts'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_destroy'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_free_ep_opts'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_progress'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_read_opts'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_disconnect'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mq_destroy'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mq_create'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_cleanup'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_req_send'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_connect'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_init'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_get_address'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_error_string'
            /opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mem_unmap'
            collect2: ld returned 1 exit status

            An ldd on /opt/mellanox/hcoll/lib/libhcoll.so shows a
            dependency on libmxm.so, so the above error makes sense.
            I am using hcoll version 3.3.768 and mxm version
            3.4.3065 (reported by rpm).

            So, my question: is there a way to take care of this
            other than putting '-L/opt/mellanox/mxm/lib -lmxm' into
            LDFLAGS/LIBS? Using LDFLAGS/LIBS will link mxm into
            everything, which I would prefer not to do.
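
One option the ld warning itself hints at is -rpath-link, which only tells the linker where to look for the dependencies of shared libraries at link time and does not add a library to the link. The sketch below just builds the flag string; rpath_link_flags is a made-up helper name, the mxm library directory is an assumption based on the --with-mxm prefix, and whether Open MPI's configure test passes with this (or whether the platform file overrides LDFLAGS) is untested here.

```shell
# Sketch: print one -Wl,-rpath-link option per directory argument.
# rpath_link_flags is a hypothetical helper name.
rpath_link_flags() {
    flags=
    for dir in "$@"; do
        # Add a separating space only when flags is already non-empty.
        flags="${flags:+$flags }-Wl,-rpath-link,$dir"
    done
    printf '%s\n' "$flags"
}

# Untested example; the mxm library directory is an assumption based on
# the install prefix given to --with-mxm:
#   ./configure ... LDFLAGS="$(rpath_link_flags /opt/mellanox/mxm/lib)"
```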

            Thanks in advance!
            David


        --
        David Shrader
        HPC-3 High Performance Computer Systems
        Los Alamos National Lab
        Email: dshrader <at> lanl.gov

        _______________________________________________
        users mailing list
        us...@open-mpi.org
        Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
        Link to this post:
        http://www.open-mpi.org/community/lists/users/2015/10/27907.php




--
    Kind Regards,

    M.







--

Kind Regards,

M.



--
David Shrader
HPC-3 High Performance Computer Systems
Los Alamos National Lab
Email: dshrader <at> lanl.gov

