Hello,

I'm trying to run the "connectivity_c" test on a variety of systems using 
OpenMPI 1.8.4. The test returns segmentation faults when running across nodes 
on one particular type of system, and only when using the openib BTL. (The test 
runs without error if I stipulate "--mca btl tcp,self".) Here's the output:

1033 fischega@bl1415[~/tmp/openmpi/1.8.4_test_examples_SLES11_SP2/error]> 
mpirun -np 16 connectivity_c
[bl1415:29526] *** Process received signal ***
[bl1415:29526] Signal: Segmentation fault (11)
[bl1415:29526] Signal code:  (128)
[bl1415:29526] Failing at address: (nil)
[bl1415:29526] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x2ab1e72915d0]
[bl1415:29526] [ 1] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_int_malloc+0x29e)[0x2ab1e7c550be]
[bl1415:29526] [ 2] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_int_memalign+0x69)[0x2ab1e7c58829]
[bl1415:29526] [ 3] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_memalign+0x6f)[0x2ab1e7c583ff]
[bl1415:29526] [ 4] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/openmpi/mca_btl_openib.so(+0x2867b)[0x2ab1eac8a67b]
[bl1415:29526] [ 5] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/openmpi/mca_btl_openib.so(+0x1f712)[0x2ab1eac81712]
[bl1415:29526] [ 6] /lib64/libpthread.so.0(+0x75f0)[0x2ab1e72895f0]
[bl1415:29526] [ 7] /lib64/libc.so.6(clone+0x6d)[0x2ab1e757484d]
[bl1415:29526] *** End of error message ***

When I run the same test using a previous build of OpenMPI 1.6.5 on this 
system, it returns a memory registration warning, but otherwise executes 
normally:

--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

OpenMPI 1.8.4 does not seem to report this memory registration warning in 
situations where previous versions did. Is that because OpenMPI 1.8.4 is no 
longer vulnerable to this condition?
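As an aside, in case it helps anyone reproduce this: the "--mca btl tcp,self" 
workaround can also be applied persistently through an MCA parameter file 
instead of on every command line. A sketch, assuming the default per-user file 
location that Open MPI reads:

```
# $HOME/.openmpi/mca-params.conf
# Restrict Open MPI to the TCP and self (loopback) BTLs,
# bypassing the openib BTL that triggers the segfault above.
btl = tcp,self
```

With this in place, a plain "mpirun -np 16 connectivity_c" picks up the 
restriction automatically; a value passed via --mca on the command line still 
overrides the file.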

Thanks,
Greg

