Hi,

I have been looking at this on and off for some time now, but I've been unable to make much progress on it.

Whatever the bug is, it causes heap corruption, which makes things segfault before MPI_Init returns. Most of the time it segfaults, but sometimes it prints other openmpi errors, and sometimes glibc aborts after detecting the heap corruption.

There is probably a threading data race here as well. If I run the test program on an idle machine with "taskset 1" (so it only runs on 1 CPU), the errors go away. I expect this is the reason why the specific error message is not 100% reproducible for me.

On Mon, 23 Jan 2017 23:20:23 +0800 YunQiang Su <[email protected]> wrote:
> It is quite strange that it won't fail if run with gdb.

I guess this is due to gdb slowing down certain threads.

Thanks,
James

