Hi,
When I run a parallel program, I get an error:
--------------------------------------------------------------------------
[n333:129522] *** Process received signal ***
[n333:129522] Signal: Segmentation fault (11)
[n333:129522] Signal code: Address not mapped (1)
[n333:129522] Failing at address: 0x40
[n333:129522] [ 0] /lib64/libpthread.so.0 [0x3c50e0e4c0]
[n333:129522] [ 1] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0 [0x4cd19b1]
[n333:129522] [ 2] /opt/openmpi-1.3.4-gnu/lib/libopen-pal.so.0(opal_progress+0x75) [0x52e5165]
[n333:129522] [ 3] /opt/openmpi-1.3.4-gnu/lib/libopen-rte.so.0 [0x508565c]
[n333:129522] [ 4] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0 [0x4c653eb]
[n333:129522] [ 5] /opt/openmpi-1.3.4-gnu/lib/libmpi.so.0(MPI_Init+0x120) [0x4c84b90]
[n333:129522] [ 6] /lustre/jxding/netplan49/nsga2b [0x4497f6]
[n333:129522] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3c5021d974]
[n333:129522] [ 8] /lustre/jxding/netplan49/nsga2b(__gxx_personality_v0+0x499) [0x4436e9]
[n333:129522] *** End of error message ***
--------------------------------------------------------------------------
mpirun has exited due to process rank 24 with PID 129522 on
node n333 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

But,
the program ran for no more than a few minutes. It should take hours
to finish.
How can it reach "finalize" so fast?
Any help is appreciated.
Jack