Nate,
A similar issue has already been reported at
https://github.com/open-mpi/ompi/issues/369, but we have
not yet been able to figure out what is going wrong.
Right after MPI_Init(), can you add
Thread.sleep(5000);
and see if it helps?
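For illustration, here is a minimal sketch of where the pause would go. The class name SleepAfterInit and the helper pause() are made up for this example, and the MPI calls are left as comments so that the snippet compiles without mpi.jar. In your program they would be the init and finalize calls you already have.

```java
public class SleepAfterInit {

    // Hypothetical helper (not part of any MPI API): sleep for the given
    // number of milliseconds and return the elapsed time in milliseconds.
    static long pause(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Your existing MPI init call goes here (commented out in this
        // sketch because it needs the Open MPI Java bindings):
        // MPI.Init(args);

        // The suggested experiment: wait 5 seconds right after init.
        long waited = pause(5000);
        System.out.println("Paused " + waited + " ms after init");

        // ... the rest of the program runs here ...

        // MPI.Finalize();
    }
}
```

If the segfault goes away or moves with the pause in place, that would be worth reporting back along with the MPI version used.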
Cheers,
Gilles
On 8/4/2015 8:36 AM, Nate Chambers wrote:
We've been struggling with this error for a while, so hoping someone
more knowledgeable can help!
Our Java MPI code exits with a segfault during its normal operation,
*but the segfault occurs before our code ever uses MPI functionality
like sending/receiving.* We've removed all message calls and any use
of MPI.COMM_WORLD from the code. The segfault occurs if we call
MPI.init(args) in our code, and does not if we comment that line out.
Further vexing us, the crash doesn't happen at the point of the
MPI.init call, but later on in the program. I don't have an
easy-to-run example here because our non-MPI code is so large and
complicated. We have run simpler test programs with MPI and the
segfault does not occur.
We have isolated the line where the segfault occurs. However, if we
comment that out, the program will run longer, but then segfault at a
seemingly arbitrary (but reproducible) point later in the code. Does anyone have
tips on how to debug this? We have tried several flags with mpirun,
but no good clues.
We have also tried several MPI versions, including stable 1.8.7 and
the most recent 1.8.8rc1.
ATTACHED
- config.log from installation
- output from `ompi_info -all`
OUTPUT FROM RUNNING
> mpirun -np 2 java -mx4g FeaturizeDay datadir/ days.txt
...
some normal output from our code
...
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 29646 on node r9n69 exited
on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/08/27386.php