Jeff Squyres wrote:
> Sorry for the delay in replying.
> What exactly is the relay program timing? Can you run a standard
> benchmark like NetPIPE, perchance? (http://www.scl.ameslab.gov/netpipe/)
The relay program gives very similar numbers to osu_latency. It turns out
that the "-mca btl" setting seems to be completely ignored, i.e.:
[bill@compute-0-0 relay]$ mpirun -np 2 -mca btl foo -machinefile m ./relay 1
compute-0-0.local compute-0-1.local
size= 1, 131072 hops, 2 nodes in 0.266 sec ( 2.027 us/hop) 1928 KB/sec
Or:
mpirun -np 2 -mca btl foo -machinefile m \
/usr/mpi/gcc/openmpi-1.2.6/tests/osu_benchmarks-3.0/osu_bw
# OSU MPI Bandwidth Test v3.0
# Size Bandwidth (MB/s)
1 2.40
...
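As a quick cross-check of the relay line above, the per-hop time is just the
elapsed time divided by the hop count. A minimal Python sketch using only the
numbers printed in that output (the function name is made up for illustration;
the small difference from the printed 2.027 us/hop presumably comes from the
elapsed time being rounded to three decimals):

```python
def us_per_hop(elapsed_sec, hops):
    """Return the average time per hop in microseconds."""
    return elapsed_sec / hops * 1e6


# Values copied from the relay output above.
latency = us_per_hop(0.266, 131072)
print(f"{latency:.2f} us/hop")
```

A figure of roughly 2 us per hop is far below what TCP over Ethernet usually
delivers, which fits the suspicion that the btl setting is not taking effect.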
My understanding is that -mca btl foo should fail since there isn't a
transport layer called foo.
[bill@compute-0-0 relay]$ which mpirun
/usr/mpi/gcc/openmpi-1.2.6/bin/mpirun
ldd ./relay
libm.so.6 => /lib64/libm.so.6 (0x00002aaaaacc7000)
libmpi.so.0 => /usr/mpi/gcc/openmpi-1.2.6/lib64/libmpi.so.0 (0x00002aaaaaf4a000)
libopen-rte.so.0 => /usr/mpi/gcc/openmpi-1.2.6/lib64/libopen-rte.so.0 (0x00002aaaab1d8000)
libopen-pal.so.0 => /usr/mpi/gcc/openmpi-1.2.6/lib64/libopen-pal.so.0 (0x00002aaaab433000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab692000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x00002aaaab896000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002aaaabaaf000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaabcb2000)
libc.so.6 => /lib64/libc.so.6 (0x00002aaaabecc000)
/lib64/ld-linux-x86-64.so.2 (0x00002aaaaaaab000)
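The point of the "which mpirun" and "ldd" output above is that the launcher
and the libraries the binary links against come from the same Open MPI
install. A rough Python sketch of that check, assuming the ldd output has been
captured as text; the helper name libs_under_prefix is invented here and is
not part of Open MPI or any other tool:

```python
def libs_under_prefix(ldd_output, prefix):
    """Return names of shared libraries that resolve to paths under prefix."""
    # Mail clients sometimes wrap long entries so that the "(0x...)" load
    # address lands on its own line; re-join those before parsing.
    joined = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if line.startswith("(") and joined:
            joined[-1] += " " + line
        elif line:
            joined.append(line)

    names = []
    for entry in joined:
        if "=>" not in entry:
            continue  # e.g. the dynamic loader line has no "=>"
        name, _, target = entry.partition("=>")
        fields = target.split()
        if fields and fields[0].startswith(prefix):
            names.append(name.strip())
    return names


# A few entries copied from the ldd output above.
sample = """\
libm.so.6 => /lib64/libm.so.6 (0x00002aaaaacc7000)
libmpi.so.0 => /usr/mpi/gcc/openmpi-1.2.6/lib64/libmpi.so.0
(0x00002aaaaaf4a000)
libopen-rte.so.0 => /usr/mpi/gcc/openmpi-1.2.6/lib64/libopen-rte.so.0
(0x00002aaaab1d8000)
"""
print(libs_under_prefix(sample, "/usr/mpi/gcc/openmpi-1.2.6/"))
```

If the list comes back empty, or points at a different prefix than mpirun,
the binary and the launcher are probably from different installs.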
So an OFED-1.3.1 ./install.pl build (or an openmpi build from source) works
with TCP, but not with infinipath (because of a missing psm library). In that
build, all the "-mca btl" functionality works as expected.
When I add "--with-psm", OFED-1.3.1 (or an openmpi build from source) works
with infinipath, but all -mca parameters are ignored. Is there a way to get
openmpi working with infinipath without the psm library? Or is there a
suggestion on how to get the -mca functionality working?