Somebody call Orkin. ;-P Well, I tried running it with things set as noted in the bug report, but it doesn't change anything on my end. I'm willing to do any verification you guys need (time permitting and all). Is anything special needed to get mpi_latency to compile? I can run that to verify that things are actually working on my end.
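My assumption is that nothing beyond the wrapper compiler should be needed for a small test like that, i.e. something along these lines (mpi_latency.c just standing in for whatever test source you want me to use):

    # compile the test with the Open MPI wrapper compiler
    /opt/ompi/bin/mpicc -o mpi_latency mpi_latency.c
    # run it across two ranks using the same hostfile as below
    /opt/ompi/bin/mpirun -np 2 -hostfile machine.list ./mpi_latency

Here are the runs with the two workarounds from the ticket: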
[root@something ompi]# /opt/ompi/bin/mpirun --mca btl_openmpi_use_eager_rdma 0 -np 2 -hostfile machine.list ./IMB-MPI1
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x3000100a819d
[0] func:/opt/ompi/lib/libopal.so.0 [0x80001c6e18]
[1] func:[0x1ffffffdfa0]
[2] func:/opt/ompi/lib/libmpi.so.0 [0x800006516c]
[3] func:/opt/ompi/lib/libmpi.so.0 [0x80000652b4]
[4] func:/opt/ompi/lib/openmpi/mca_btl_openib.so [0x800056f2f0]
[5] func:/opt/ompi/lib/libmpi.so.0 [0x80000d0540]
[6] func:/opt/ompi/lib/openmpi/mca_bml_r2.so [0x80005548a8]
[7] func:/opt/ompi/lib/libmpi.so.0 [0x80000cfc8c]
[8] func:/opt/ompi/lib/openmpi/mca_pml_ob1.so [0x8000533d9c]
[9] func:/opt/ompi/lib/libmpi.so.0 [0x80000d9988]
[10] func:/opt/ompi/lib/libmpi.so.0 [0x8000087a80]
[11] func:/opt/ompi/lib/libmpi.so.0 [0x80000b09ac]
[12] func:./IMB-MPI1 [0x10003328]
[13] func:/lib64/tls/libc.so.6 [0x8064e9415c]
[14] func:/lib64/tls/libc.so.6 [0x8064e942e4]
*** End of error message ***

[root@something ompi]# /opt/ompi/bin/mpirun --mca btl_openmpi_use_srq 1 -np 2 -hostfile machine.list ./IMB-MPI1
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x3000100a819d
[0] func:/opt/ompi/lib/libopal.so.0 [0x80001c6e18]
[1] func:[0x1ffffffdfa0]
[2] func:/opt/ompi/lib/libmpi.so.0 [0x800006516c]
[3] func:/opt/ompi/lib/libmpi.so.0 [0x80000652b4]
[4] func:/opt/ompi/lib/openmpi/mca_btl_openib.so [0x800056f2f0]
[5] func:/opt/ompi/lib/libmpi.so.0 [0x80000d0540]
[6] func:/opt/ompi/lib/openmpi/mca_bml_r2.so [0x80005548a8]
[7] func:/opt/ompi/lib/libmpi.so.0 [0x80000cfc8c]
[8] func:/opt/ompi/lib/openmpi/mca_pml_ob1.so [0x8000533d9c]
[9] func:/opt/ompi/lib/libmpi.so.0 [0x80000d9988]
[10] func:/opt/ompi/lib/libmpi.so.0 [0x8000087a80]
[11] func:/opt/ompi/lib/libmpi.so.0 [0x80000b09ac]
[12] func:./IMB-MPI1 [0x10003328]
[13] func:/lib64/tls/libc.so.6 [0x8064e9415c]
[14] func:/lib64/tls/libc.so.6 [0x8064e942e4]
*** End of error message ***
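(A note on the runs above: I may well have mistyped the parameter names -- my understanding is the openib BTL's parameters are spelled with "openib" in them, so the two workarounds from the ticket would look something like:

    # workaround 1: disable eager RDMA in the openib BTL
    /opt/ompi/bin/mpirun --mca btl_openib_use_eager_rdma 0 -np 2 -hostfile machine.list ./IMB-MPI1
    # workaround 2: use the shared receive queue instead
    /opt/ompi/bin/mpirun --mca btl_openib_use_srq 1 -np 2 -hostfile machine.list ./IMB-MPI1

and I believe an unrecognized --mca name is silently ignored, which might explain why nothing changed. Something like "ompi_info --param btl openib" should list the exact spellings.)

On 5/24/06, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: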
There is a known issue with OpenIB on PPC machines at the moment -- see:

https://svn.open-mpi.org/trac/ompi/ticket/23

A temporary workaround is to either use the SRQ or disable eager RDMA. See the bug for the details of both of these options.

------------------------------
From: users-boun...@open-mpi.org On Behalf Of Paul
Sent: Wednesday, May 24, 2006 6:53 PM
To: us...@open-mpi.org
Subject: [OMPI users] pallas assistance ?

So I have 64-bit PPC versions of Open MPI, OpenIB, and the Pallas files (IMB-MPI1 being the important one). ldd checks out okay and shows nothing missing. However, when I try to execute the Pallas run it dies like so:

[root@thing ompi]# /opt/ompi/bin/mpirun -np 2 -machinefile machine.list ./IMB-MPI1
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x3000100a819d
[0] func:/opt/ompi/lib/libopal.so.0 [0x80001c6e18]
[1] func:[0x1ffffffdfd0]
[2] func:/opt/ompi/lib/libmpi.so.0 [0x800006516c]
[3] func:/opt/ompi/lib/libmpi.so.0 [0x80000652b4]
[4] func:/opt/ompi/lib/openmpi/mca_btl_openib.so [0x800056f2f0]
[5] func:/opt/ompi/lib/libmpi.so.0 [0x80000d0540]
[6] func:/opt/ompi/lib/openmpi/mca_bml_r2.so [0x80005548a8]
[7] func:/opt/ompi/lib/libmpi.so.0 [0x80000cfc8c]
[8] func:/opt/ompi/lib/openmpi/mca_pml_ob1.so [0x8000533d9c]
[9] func:/opt/ompi/lib/libmpi.so.0 [0x80000d9988]
[10] func:/opt/ompi/lib/libmpi.so.0 [0x8000087a80]
[11] func:/opt/ompi/lib/libmpi.so.0 [0x80000b09ac]
[12] func:./IMB-MPI1 [0x10003328]
[13] func:/lib64/tls/libc.so.6 [0x8064e9415c]
[14] func:/lib64/tls/libc.so.6 [0x8064e942e4]
*** End of error message ***

Are there any special things that need to be done with Pallas, OpenIB, and Open MPI? Pallas compiled fine and linked okay with the needed libraries. Currently machine.list is just localhost twice.
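P.S. One more data point I can collect, assuming I have the syntax right: forcing the TCP BTL instead of openib should show whether the crash is specific to the InfiniBand path:

    # run over TCP/self only, bypassing the openib BTL entirely
    /opt/ompi/bin/mpirun --mca btl tcp,self -np 2 -hostfile machine.list ./IMB-MPI1

If that runs clean, it points at the openib BTL / eager-RDMA issue from the ticket rather than the Pallas build itself.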