On 25 October 2007 at 07:54, Jeff Squyres wrote:
| We will not dlopen libibverbs.so directly -- we will only dlopen the
| mca_btl_openib.so file. The dynamic linker will automatically open
| all of its dependencies. If those dependencies cannot be found /
| symbols cannot be resolved, the dynamic linker will fail the dlopen
| of libibverbs.
|
| Can you run "ldd mca_btl_openib.so" on your head node and your
| compute nodes? See if there's a difference in the output. I think
| this is the next step in this troubleshooting process...
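For more than a couple of nodes, the head-versus-compute comparison suggested above could be scripted rather than done by eye. What follows is only a rough sketch, assuming the ldd output from each machine has already been captured as text; the function names parse_ldd and compare_ldd are made up for illustration, and the regular expressions only cover the common "name => path (address)" and "version ... not found" line shapes.

```python
import re

# Matches lines such as "libfoo.so.1 => /usr/lib/libfoo.so.1 (0xb7f42000)".
# Lines without a resolved path (e.g. linux-gate.so.1) are deliberately skipped.
_RESOLVED = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-fA-F]+\)\s*$")
# Matches loader complaints such as "version `IBVERBS_1.1' not found".
_MISSING_VERSION = re.compile(r"version `([^']+)' not found")


def parse_ldd(text):
    """Return (libs, missing): a name -> path dict and a list of
    symbol versions that ldd reported as not found."""
    libs = {}
    missing = []
    for line in text.splitlines():
        m = _MISSING_VERSION.search(line)
        if m:
            missing.append(m.group(1))
            continue
        m = _RESOLVED.match(line)
        if m:
            libs[m.group(1)] = m.group(2)
    return libs, missing


def compare_ldd(head_text, node_text):
    """Summarise the differences between two captured ldd outputs."""
    head_libs, head_missing = parse_ldd(head_text)
    node_libs, node_missing = parse_ldd(node_text)
    shared = set(head_libs) & set(node_libs)
    return {
        "only_on_head": sorted(set(head_libs) - set(node_libs)),
        "only_on_node": sorted(set(node_libs) - set(head_libs)),
        "different_path": sorted(
            name for name in shared if head_libs[name] != node_libs[name]
        ),
        "missing_versions": {"head": head_missing, "node": node_missing},
    }
```

One way to use it: redirect "ldd /usr/lib/openmpi/mca_btl_openib.so" into a file on each machine (say head.txt and node.txt, both hypothetical names), read the two files, and pass their contents to compare_ldd. A non-empty "missing_versions" entry, or a library that appears on only one side, is the kind of difference worth chasing first.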
Sure, good idea.

head and build machine:

$ ldd /usr/lib/openmpi/mca_btl_openib.so
        linux-gate.so.1 =>  (0xffffe000)
        libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0xb7f42000)
        libpthread.so.0 => /lib/libpthread.so.0 (0xb7f2b000)
        libmpi.so.0 => /usr/lib/libmpi.so.0 (0xb7ea6000)
        libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0xb7e52000)
        libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0xb7dfb000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7df7000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xb7de1000)
        libutil.so.1 => /lib/libutil.so.1 (0xb7ddd000)
        libm.so.6 => /lib/libm.so.6 (0xb7db7000)
        libc.so.6 => /lib/libc.so.6 (0xb7c8a000)
        /lib/ld-linux.so.2 (0x80000000)

compute node:

$ ldd /usr/lib/openmpi/mca_btl_openib.so
/usr/lib/openmpi/mca_btl_openib.so: /usr/lib/libibverbs.so.1: version `IBVERBS_1.1' not found (required by /usr/lib/openmpi/mca_btl_openib.so)
        linux-gate.so.1 =>  (0xffffe000)
        libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0xb7ee6000)
        libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ecf000)
        libmpi.so.0 => /usr/lib/libmpi.so.0 (0xb7e4a000)
        libopen-rte.so.0 => /usr/lib/libopen-rte.so.0 (0xb7df6000)
        libopen-pal.so.0 => /usr/lib/libopen-pal.so.0 (0xb7d9f000)
        libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7d9b000)
        libnsl.so.1 => /lib/tls/i686/cmov/libnsl.so.1 (0xb7d84000)
        libutil.so.1 => /lib/tls/i686/cmov/libutil.so.1 (0xb7d80000)
        libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0xb7d58000)
        libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c17000)
        libsysfs.so.2 => /lib/libsysfs.so.2 (0xb7c0c000)
        /lib/ld-linux.so.2 (0x80000000)

Bingo!! And I stand caught with an inconsistent package install. Tsk tsk.

I *think* this may be due to the fact that at one point, before "we" (as
in the few folks looking after the .deb for Open MPI) had learned about
the 'btl ^openib' option, I had become so disenchanted with the 'noisy'
message that I hacked libibverbs. That may explain the head node.
Let me get that one back to the pristine Ubuntu / Debian package, and
then possibly rebuild the Open MPI package there to get the correct
Depends going.

Thanks so much for your help and patience on this.

Dirk

--
Three out of two people have difficulties with fractions.