Hi,
Are you using some derived datatypes that are freed (MPI_Type_free)
*before* the non blocking communication completes ?
this is a known issue we are currently working on (but that was already
present in 1.8.5)
can you write and post a simple program that evidences this issue ?
Cheers,
Gilles
On 1/26/2016 2:35 PM, Eva wrote:
openmpi-1.10.2 cores at mca_coll_libnbc.so
My program is transferred from 1.8.5 to 1.10.2. But when I run it, it
cores as below.
Program terminated with signal 11, Segmentation fault.
#0 0x00007fa3550f51d2 in ompi_coll_libnbc_igather () from
/home/work/wuzhihua/install/openmpi-1.10.2rc3-gcc4.8/lib/openmpi/mca_coll_libnbc.so
Missing separate debuginfos, use: debuginfo-install
glibc-2.12-1.80.el6.x86_64 libgcc-4.4.6-4.el6.x86_64
libibumad-1.3.9.MLNX20140817.485ffa6-0.1.x86_64
libibverbs-1.1.7mlnx1-OFED.2.3.124.g4d8b179.x86_64
libnl-1.1-14.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
opensm-libs-4.2.5.MLNX20140828.7f05469-0.1.x86_64
openssl-1.0.0-20.el6_2.5.x86_64 tcp_wrappers-libs-7.6-57.el6.x86_64
zlib-1.2.3-27.el6.x86_64
(gdb) bt
#0 0x00007fa3550f51d2 in ompi_coll_libnbc_igather () from
/home/install/openmpi-1.10.2rc3-gcc4.8/lib/openmpi/mca_coll_libnbc.so
#1 0x0000000000010202 in ?? ()
#2 0x0000000000000033 in ?? () at
/home/openmpi-1.10.2rc3-gcc4.8/include/openmpi/ompi/mpi/cxx/win_inln.h:42
#3 0x00007fa359faf180 in ?? () from
/home/install/openmpi-1.10.2rc3-gcc4.8/lib/libopen-pal.so.13
#4 0x0000000000000000 in ?? ()
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2016/01/28373.php