I built open-mpi 1.6.1 using the open-mx libraries. This worked previously, but now I get the error shown below. Here is my system:

kernel: 2.6.32-279.5.1.el6.x86_64
open-mx: 1.5.2

BTW, open-mx worked previously with open-mpi and the current version works with mpich2

$ mpiexec -np 8 -machinefile machines cpi
Process 0 on limulus
FatalError: Failed to lookup peer by addr, driver replied Bad file descriptor
cpi: ../omx_misc.c:89: omx__ioctl_errno_to_return_checked: Assertion `0' failed.
[limulus:04448] *** Process received signal ***
[limulus:04448] Signal: Aborted (6)
[limulus:04448] Signal code: (-6)
[limulus:04448] [ 0] /lib64/libpthread.so.0() [0x3324e0f500]
[limulus:04448] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x33246328a5]
[limulus:04448] [ 2] /lib64/libc.so.6(abort+0x175) [0x3324634085]
[limulus:04448] [ 3] /lib64/libc.so.6() [0x332462ba1e]
[limulus:04448] [ 4] /lib64/libc.so.6(__assert_perror_fail+0) [0x332462bae0]
[limulus:04448] [ 5] /usr/open-mx/lib/libopen-mx.so.0(omx__ioctl_errno_to_return_checked+0x197) [0x7fb587418b37]
[limulus:04448] [ 6] /usr/open-mx/lib/libopen-mx.so.0(omx__peer_addr_to_index+0x55) [0x7fb58741a5d5]
[limulus:04448] [ 7] /usr/open-mx/lib/libopen-mx.so.0(+0xdc7a) [0x7fb587419c7a]
[limulus:04448] [ 8] /usr/open-mx/lib/libopen-mx.so.0(omx_connect+0x8c) [0x7fb58741a27c]
[limulus:04448] [ 9] /usr/open-mx/lib/libopen-mx.so.0(mx_connect+0x15) [0x7fb587425865]
[limulus:04448] [10] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(mca_btl_mx_proc_connect+0x5e) [0x7fb5876fe40e]
[limulus:04448] [11] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(mca_btl_mx_send+0x2d4) [0x7fb5876fbd94]
[limulus:04448] [12] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(mca_pml_ob1_send_request_start_prepare+0xcb) [0x7fb58777d6fb]
[limulus:04448] [13] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(mca_pml_ob1_isend+0x4cb) [0x7fb58777509b]
[limulus:04448] [14] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(ompi_coll_tuned_bcast_intra_generic+0x37b) [0x7fb58770b55b]
[limulus:04448] [15] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(ompi_coll_tuned_bcast_intra_binomial+0xd8) [0x7fb58770b8b8]
[limulus:04448] [16] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(ompi_coll_tuned_bcast_intra_dec_fixed+0xcc) [0x7fb587702d8c]
[limulus:04448] [17] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(mca_coll_sync_bcast+0x78) [0x7fb587712e88]
[limulus:04448] [18] /opt/mpi/openmpi-gnu4/lib64/libmpi.so.1(MPI_Bcast+0x130) [0x7fb5876ce1b0]
[limulus:04448] [19] cpi(main+0x10b) [0x400cc4]
[limulus:04448] [20] /lib64/libc.so.6(__libc_start_main+0xfd) [0x332461ecdd]
[limulus:04448] [21] cpi() [0x400ac9]
[limulus:04448] *** End of error message ***
Process 2 on limulus
Process 4 on limulus
Process 6 on limulus
Process 1 on n0
Process 7 on n0
Process 3 on n0
Process 5 on n0
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 4448 on node limulus exited on signal 6 (Aborted).
--------------------------------------------------------------------------

--
Doug