On Oct 24, 2007, at 10:05 PM, Dirk Eddelbuettel wrote:
| > | If I had to guess, the systems where you don't see the warning are
| > | systems that have OFED loaded.
| >
| > I am pretty sure that none of the systems (at work) have IB hardware.
| > I am very sure that my home systems do not, and there the
| > 'btl = ^openib' successfully suppresses the warning --- whereas at work
| > it doesn't.
|
| Note that you don't need to have IB hardware -- all you need is the
| OFED software loaded. I don't know if Debian ships the OFED
| libraries by default...? In particular, look for libibverbs:
|
| [18:28] svbu-mpi:~/svn/ompi % ldd $bogus/lib/openmpi/mca_btl_openib.so
| libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000002a956c2000)
| libnsl.so.1 => /lib64/libnsl.so.1 (0x0000002a957cd000)
| libutil.so.1 => /lib64/libutil.so.1 (0x0000002a958e4000)
| libm.so.6 => /lib64/tls/libm.so.6 (0x0000002a959e8000)
| libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a95b6e000)
| libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a95c83000)
| libdl.so.2 => /lib64/libdl.so.2 (0x0000002a95eb8000)
| /lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000)
Good point. However, I use the .deb packages which I build for Debian,
and they use libibverbs where available:
Build-Depends: [...], libibverbs-dev [!kfreebsd-i386 !kfreebsd-amd64 \
!hurd-i386], gfortran, libsysfs-dev, automake, gcc (>= 4:4.1.2)
in particular on i386. Consequently, the binary package ends up with a
Depends on the run-time package 'libibverbs1' -- and this will hence always
be present, as all my systems use the .deb packages (either from Debian or
locally rebuilt) that force libibverbs1 in via this Depends.
At work, I rebuild these same packages under Ubuntu on my "head node". And
on the head node, no warning is seen -- whereas my compute nodes issue the
warning.
Could this be another one of the dlopen issues where basically
dlopen("libibverbs.so")
is executed? Because the compute nodes do NOT have libibverbs.so (from the
-dev package) but only libibverbs.so.1.0.0 and its matching symlink
libibverbs.so.1.
We will not dlopen libibverbs.so directly -- we will only dlopen the
mca_btl_openib.so file. The dynamic linker will automatically open
all of its dependencies. If those dependencies cannot be found /
symbols cannot be resolved, the dynamic linker will fail the dlopen
of mca_btl_openib.so.
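
One way to see the dynamic linker's own error message is to dlopen the plugin by hand. The following is a minimal Python sketch, not how Open MPI loads its components internally. `try_dlopen` is a hypothetical helper, and the plugin path is an assumption that must be adjusted to the local install.

```python
import ctypes


def try_dlopen(path):
    """Try to dlopen a shared object; return (ok, message).

    ctypes.CDLL calls dlopen() underneath, so the dynamic linker
    resolves the object's dependencies just as it would for any
    other dlopen caller.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        # The exception text is the dynamic linker's error string.
        return False, str(exc)
    return True, "loaded"


if __name__ == "__main__":
    # Hypothetical location; substitute the real plugin path.
    plugin = "/usr/lib/openmpi/lib/openmpi/mca_btl_openib.so"
    ok, message = try_dlopen(plugin)
    print(plugin, "->", "OK" if ok else "FAILED: " + message)
```

Read the message rather than just the result: loading the plugin outside of mpirun may also fail for reasons unrelated to libibverbs, for example symbols normally provided by the Open MPI libraries themselves.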
Can you run "ldd mca_btl_openib.so" on your head node and your
compute nodes? See if there's a difference in the output. I think
this is the next step in this troubleshooting process...
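
The comparison can be sketched in a few lines of Python once the ldd output from both machines has been saved. `parse_ldd` and `compare_ldd` are hypothetical helper names, and the "not found" line in the usage example is made-up sample text, not output from any of the systems discussed here.

```python
import re

# Matches lines such as:
#   libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000002a956c2000)
#   libibverbs.so.1 => not found
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(not found|\S+)")


def parse_ldd(output):
    """Map each library name to its resolved path, or None if not found."""
    libs = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            name, target = match.groups()
            libs[name] = None if target == "not found" else target
    return libs


def compare_ldd(head_output, compute_output):
    """Return {name: (head_value, compute_value)} for entries that differ.

    A value of None means ldd printed "not found"; the string "absent"
    means the library did not appear in that output at all.
    """
    head = parse_ldd(head_output)
    compute = parse_ldd(compute_output)
    diff = {}
    for name in sorted(set(head) | set(compute)):
        h = head.get(name, "absent")
        c = compute.get(name, "absent")
        if h != c:
            diff[name] = (h, c)
    return diff


if __name__ == "__main__":
    head = "libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x0000002a956c2000)\n"
    compute = "libibverbs.so.1 => not found\n"  # hypothetical sample
    for name, (h, c) in compare_ldd(head, compute).items():
        print(name, "head:", h, "compute:", c)
```

Lines without "=>", such as the ld-linux entry, are skipped on purpose because they carry no name-to-path mapping to compare.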
I just tested that hypothesis and installed libibverbs-dev, but no beans.
Still get the warning.
| However, I note something in your last reply that I may have missed
| before -- can you clarify a point for me: are you saying that on your
| home machine, this generates the openib "file not found" warning:
|
| mpirun -np 2 hello
|
| but this does not:
|
| mpirun -np 2 --mca btl ^openib hello
More or less, but I use /etc/openmpi/openmpi-mca-params.conf to toggle
^openib. Adding it again as --mca btl ^openib changes nothing,
unfortunately.
This MCA behavior is as expected; adding a param to
openmpi-mca-params.conf is exactly the same as putting it on the command
line (except that the command line has higher precedence).
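
The precedence rule can be modeled with a small Python sketch. This is an illustration only, not Open MPI's actual parser: `parse_params_file` and `effective_params` are hypothetical names, and the simple "key = value" file syntax with "#" comments is an assumption.

```python
def parse_params_file(text):
    """Parse 'key = value' lines, ignoring blanks and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def effective_params(file_params, cmdline_params):
    """Merge the two sources; command-line values override file values."""
    merged = dict(file_params)
    merged.update(cmdline_params)
    return merged


if __name__ == "__main__":
    file_params = parse_params_file("btl = ^openib\n")
    # Repeating the same value on the command line changes nothing.
    print(effective_params(file_params, {"btl": "^openib"}))
    # A different command-line value wins over the file.
    print(effective_params(file_params, {"btl": "tcp,self"}))
```

In this model, setting `btl = ^openib` in the file and then passing `--mca btl ^openib` again yields the same effective value, which matches the behavior reported above.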
--
Jeff Squyres
Cisco Systems