We've been struggling with this error for a while, so we're hoping someone
more knowledgeable can help!

Our Java MPI code exits with a segfault during its normal operation, *but
the segfault occurs before our code ever uses MPI functionality like
sending/receiving.* We've removed all message calls and any use of
MPI.COMM_WORLD from the code. The segfault occurs if we call MPI.init(args)
in our code, and does not if we comment that line out. Further vexing us,
the crash doesn't happen at the point of the MPI.init call, but later on in
the program. I don't have an easy-to-run example here because our non-MPI
code is so large and complicated. We have run simpler test programs with
MPI and the segfault does not occur.
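
To at least show the shape of the program, here is a stripped-down sketch
(featurizeAllDays is a placeholder, not our real code; we call the Java
bindings that ship with Open MPI, where initialization is spelled
MPI.Init(args)):

import mpi.MPI;

// Skeleton only -- FeaturizeDay is the real entry point, but
// featurizeAllDays() stands in for our large, purely non-MPI pipeline.
public class FeaturizeDay {

    public static void main(String[] args) throws Exception {
        MPI.Init(args);  // commenting out this one line makes the segfault disappear

        // No other MPI calls anywhere: no sends/receives, no MPI.COMM_WORLD.
        featurizeAllDays();  // the JVM segfaults somewhere in here, long after Init returns

        MPI.Finalize();
    }

    private static void featurizeAllDays() {
        // large non-MPI featurization code, elided here
    }
}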

We have isolated the line where the segfault occurs. However, if we comment
that line out, the program runs longer but then segfaults at a different
point later in the code; the new crash site is arbitrary but reproducible
from run to run. Does anyone have tips on how to debug this? We have tried
several flags with mpirun, but they gave no useful clues.

We have also tried several Open MPI versions, including the stable 1.8.7
and the most recent 1.8.8rc1.


ATTACHED
- config.log.bz2 from our installation
- ompi_info.txt.bz2 (output from `ompi_info -all`)


OUTPUT FROM RUNNING

> mpirun -np 2 java -mx4g FeaturizeDay datadir/ days.txt
...
some normal output from our code
...
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 29646 on node r9n69 exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

