On Feb 22, 2006, at 3:29 PM, Aniruddha Shet wrote:
> I tried with openmpi-1.1a1r9098.tar.bz2 but still encounter the same
> problem.
> There is no core being produced. I am sending you whatever output
> trace is written. Not sure if the attached trace will allow you to
> debug the problem.
I'm not sure I understand the output -- there seem to be 2 stack
traces shown: one for an MPI process and one for mpirun itself (aka
mpiexec and also aka orterun).
I *think* what is happening is that your process is segv'ing:
mpiexec noticed that job rank 0 with PID 17721 on node "piv110"
exited on signal 11.
but that is somehow causing mpirun to segv (!), which shouldn't
happen. But I can see how that might occur if your MPI processes are
dying during startup and the connections to mpirun are only half-formed.
Additionally, the trace from what appears to be PID 17721 isn't too
helpful:
[0] func:./p1 [0x81d362d]
[0] func:./p1 [0x81d362d]
[1] func:/lib/i686/libpthread.so.0 [0x400c40ba]
[2] func:/lib/i686/libc.so.6 [0x40131ee0]
[3] func:./p1(mpiPi_init+0x5c) [0x808b2a6]
[0] func:./p1 [0x81d362d]
[1] func:/lib/i686/libpthread.so.0 [0x400c40ba]
[1] func:/lib/i686/libpthread.so.0 [0x400c40ba]
[2] func:/lib/i686/libc.so.6 [0x40131ee0]
[2] func:/lib/i686/libc.so.6 [0x40131ee0]
[3] func:./p1(mpiPi_init+0x5c) [0x808b2a6]
[3] func:./p1(mpiPi_init+0x5c) [0x808b2a6]
mpiexec noticed that job rank 0 with PID 17721 on node "piv110"
exited on signal 11.
Can you compile your application (and potentially mpiP) with
debugging enabled so that we can see more information?
Also, I'm confused by these statements at the end of your output:
rcp: core*: No such file or directory
rcp: core*: No such file or directory
rcp: core*: No such file or directory
Do you know what is running those rcp commands?
Can you check into why corefiles are not being produced? Check your
shell settings to ensure that corefiles are enabled (e.g., in bash,
"ulimit -c unlimited"; in tcsh, "unlimit coredumpsize"). You may need
to put this in your shell startup files if you're using rsh/ssh to
start processes, because those remote shells do not inherit the
limits of your interactive login session.
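If it helps, here is a minimal sketch of a way to see which corefile limit a process actually gets. It is only an illustration, not anything that ships with Open MPI: the function name describe_core_limit and the script name check_core.py are made up for this example, and it assumes a Python interpreter is available on the node.

```python
import resource


def describe_core_limit(soft, hard):
    """Summarize a (soft, hard) RLIMIT_CORE pair in plain words.

    describe_core_limit is a hypothetical helper for this example.
    Values are in bytes, as returned by resource.getrlimit().
    """
    if soft == 0:
        if hard == 0:
            # A hard limit of 0 cannot be raised by an unprivileged process.
            return "disabled (hard limit is 0; cannot be raised without privileges)"
        return "disabled (soft limit is 0; can be raised)"
    if soft == resource.RLIM_INFINITY:
        return "enabled (unlimited)"
    return "enabled (limited to %d bytes)" % soft


if __name__ == "__main__":
    # Report the limit that this particular process inherited.
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("core files:", describe_core_limit(soft, hard))
```

Running it the same way your MPI processes get started (for example "ssh piv110 python check_core.py", assuming ssh is your launcher) should show whether the limit seen there differs from the one in your login shell.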
--
{+} Jeff Squyres
{+} The Open MPI Project
{+} http://www.open-mpi.org/