It turns out that the contents of /etc are held in RAM, so mxm.conf wasn't
there because that area hadn't been refreshed yet, either by an admin
pushing it out manually or by a reboot. The admins pushed it out, and
now ldd on libhcoll.so resolves the libmxm.so dependency, and configure
works without having to set LD_LIBRARY_PATH.
So, not an Open MPI issue, but I am very grateful for all the help!
David
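A quick way to repeat this check after a config push is to filter the ldd
output for unresolved entries. What follows is only a sketch: the helper
name missing_deps is made up for illustration, and it assumes the usual
"name => not found" format that ldd printed elsewhere in this thread.

```shell
# Hypothetical helper (not part of any package): read ldd output on stdin
# and print the name of every dependency that ldd could not resolve.
missing_deps() {
    awk '/=> not found/ { print $1 }'
}
```

Running "ldd /opt/mellanox/hcoll/lib/libhcoll.so | missing_deps" should
print nothing once libmxm.so.2 resolves. "ldconfig -p | grep libmxm" is
another way to see whether the loader cache knows about the library.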
On 10/21/2015 12:00 PM, David Shrader wrote:
I'm sorry I missed reporting on that. I do not have
/etc/ld.so.conf.d/mxm.conf.
Interestingly enough, the rpm reports that it does include that file,
but it isn't there:
[dshrader@zo-fe1 serial]$ rpm -qa | grep mxm
mxm-3.4.3065-1.x86_64
[dshrader@zo-fe1 serial]$ rpm -ql mxm-3.4.3065-1.x86_64
/etc/ld.so.conf.d/mxm.conf
...output snipped...
[dshrader@zo-fe1 serial]$ ll /etc/ld.so.conf.d/mxm.conf
ls: cannot access /etc/ld.so.conf.d/mxm.conf: No such file or directory
I'll follow up with the admin who installed the rpm.
Thanks,
David
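The mismatch above, where rpm lists a file that is not on disk, can be
checked for a whole package at once. A minimal sketch, assuming a
hypothetical helper named report_missing_files that reads one absolute
path per line:

```shell
# Hypothetical helper: read paths on stdin (one per line) and print
# every path that does not exist on the local filesystem.
report_missing_files() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done
}
```

For example, "rpm -ql mxm-3.4.3065-1.x86_64 | report_missing_files" would
list /etc/ld.so.conf.d/mxm.conf on a node in the state shown above.
"rpm -V mxm" should flag such files as missing as well.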
On 10/21/2015 11:37 AM, Mike Dubman wrote:
Could you please check whether the file /etc/ld.so.conf.d/mxm.conf exists
on your system?
It will help us understand why hcoll did not detect libmxm.so on the
first attempt.
Thanks
On Wed, Oct 21, 2015 at 7:19 PM, David Shrader <dshra...@lanl.gov> wrote:
We're using TOSS, which is based on Red Hat; the version we're
currently running is based on Red Hat 6.6. I'm not sure which
MOFED version we're using: I can't tell from what I can find on
the system, and the admins responsible for it are out. I'll get
back to you on that as soon as I know.
Setting LD_LIBRARY_PATH before running configure got it to work,
which I didn't expect. Thanks for the tip! I didn't realize that
resolving the shared-library dependencies of a library on the
active compile line falls under the runtime portion of linking,
and so can be affected by LD_LIBRARY_PATH.
Thanks!
David
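For context, the GNU ld documentation lists LD_LIBRARY_PATH (for a native
linker) and the directories from /etc/ld.so.conf among the places it
searches for libraries that are needed only by other shared libraries,
which would match the behavior seen here. Below is a small sketch of
setting the variable for a shell session, assuming the
/opt/mellanox/mxm/lib path mentioned in this thread; the helper name
prepend_ld_path is hypothetical.

```shell
# Hypothetical helper: put directory $1 in front of LD_LIBRARY_PATH.
# The ${VAR:+...} form avoids a trailing ':' when the variable is empty;
# an empty entry would make the loader search the current directory.
prepend_ld_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}
```

After "prepend_ld_path /opt/mellanox/mxm/lib", running ./configure in the
same shell sees the extra directory. The linker warning quoted in the
original report also points at -rpath and -rpath-link as alternatives.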
On 10/21/2015 09:59 AM, Mike Dubman wrote:
Hi David,
What Linux distro do you use (and which MOFED version)?
Do you have the file /etc/ld.so.conf.d/mxm.conf?
Can you please try running LD_LIBRARY_PATH=/opt/mellanox/mxm/lib
./configure ....?
Thanks
On Wed, Oct 21, 2015 at 6:40 PM, David Shrader
<dshra...@lanl.gov> wrote:
I should probably point out that libhcoll.so does not know
where libmxm.so is:
[dshrader@zo-fe1 ~]$ ldd /opt/mellanox/hcoll/lib/libhcoll.so
linux-vdso.so.1 => (0x00007fffb2f1f000)
libibnetdisc.so.5 => /usr/lib64/libibnetdisc.so.5 (0x00007fe31bd0b000)
libmxm.so.2 => not found
libz.so.1 => /lib64/libz.so.1 (0x00007fe31baf4000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fe31b8f0000)
libosmcomp.so.3 => /usr/lib64/libosmcomp.so.3 (0x00007fe31b6e2000)
libocoms.so.0 => /opt/mellanox/hcoll/lib/libocoms.so.0 (0x00007fe31b499000)
libm.so.6 => /lib64/libm.so.6 (0x00007fe31b215000)
libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00007fe31b009000)
libalog.so.0 => /opt/mellanox/hcoll/lib/libalog.so.0 (0x00007fe31adfe000)
librt.so.1 => /lib64/librt.so.1 (0x00007fe31abf6000)
libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00007fe31a9ee000)
librdmacm.so.1 => /usr/lib64/librdmacm.so.1 (0x00007fe31a7d9000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00007fe31a5c7000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fe31a3a9000)
libc.so.6 => /lib64/libc.so.6 (0x00007fe31a015000)
libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007fe319cfe000)
libibmad.so.5 => /usr/lib64/libibmad.so.5 (0x00007fe319ae3000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe31c2d3000)
libwrap.so.0 => /lib64/libwrap.so.0 (0x00007fe3198d8000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe3196c2000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x00007fe3194a8000)
libutil.so.1 => /lib64/libutil.so.1 (0x00007fe3192a5000)
libnl.so.1 => /lib64/libnl.so.1 (0x00007fe319052000)
Both hcoll and mxm were installed using the rpms provided
by Mellanox.
Thanks again,
David
On 10/21/2015 09:34 AM, David Shrader wrote:
Hello All,
I'm currently trying to install Open MPI 1.10.0 with hcoll and
mxm, and am getting an error during configure:
--- MCA component coll:hcoll (m4 configuration macro)
checking for MCA component coll:hcoll compile mode... static
checking hcoll/api/hcoll_api.h usability... yes
checking hcoll/api/hcoll_api.h presence... yes
checking for hcoll/api/hcoll_api.h... yes
looking for library in lib
checking for library containing hcoll_get_version... no
looking for library in lib64
checking for library containing hcoll_get_version... no
configure: error: HCOLL support requested but not found. Aborting
The configure line I used:
./configure --with-mxm=/opt/mellanox/mxm \
    --with-hcoll=/opt/mellanox/hcoll \
    --with-platform=contrib/platform/lanl/toss/optimized-panasas
Here are the corresponding lines from config.log:
configure:217014: gcc -std=gnu99 -o conftest -O3
-DNDEBUG -I/opt/panfs/include -finline-functions
-fno-strict-aliasing -pthread
-I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/hwloc/hwloc191/hwloc/include
-I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/event/libevent2021/libevent
-I/usr/projects/hpctools/dshrader/hpcsoft/openmpi/1.10.0/openmpi-1.10.0/opal/mca/event/libevent2021/libevent/include
-I/opt/mellanox/hcoll/include -L/opt/mellanox/hcoll/lib
conftest.c -lhcoll -lrt -lm -lutil >&5
/usr/bin/ld: warning: libmxm.so.2, needed by /opt/mellanox/hcoll/lib/libhcoll.so, not found (try using -rpath or -rpath-link)
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_req_recv'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_create'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_free_context_opts'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_destroy'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_free_ep_opts'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_progress'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_config_read_opts'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_disconnect'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mq_destroy'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mq_create'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_cleanup'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_req_send'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_connect'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_init'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_ep_get_address'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_error_string'
/opt/mellanox/hcoll/lib/libhcoll.so: undefined reference to `mxm_mem_unmap'
collect2: ld returned 1 exit status
An ldd on /opt/mellanox/hcoll/lib/libhcoll.so shows a
dependency on libmxm.so, so the above error makes sense.
I am using hcoll version 3.3.768 and mxm version
3.4.3065 (reported by rpm).
So, my question: is there a way to take care of this
other than putting '-L/opt/mellanox/mxm/lib -lmxm' into
LDFLAGS/LIBS? Using LDFLAGS/LIBS will link mxm into
everything, which I would prefer not to do.
Thanks in advance!
David
--
David Shrader
HPC-3 High Performance Computer Systems
Los Alamos National Lab
Email: dshrader <at> lanl.gov
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/10/27907.php
--
Kind Regards,
M.