On Jan 26, 2009, at 4:57 PM, Ted Yu wrote:
I'm new to this group. I'm trying to implement a parallel quantum
code called "Seqquest".
I'm trying to figure out why a run of this code fails with the
following error:
This job has allocated 2 cpus
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:(nil)
[0] func:/usr/lib64/openmpi/libopal.so.0 [0x393af21dc5]
[1] func:/lib64/tls/libpthread.so.0 [0x393b80c4f0]
[2] func:/project/source/seqquest/seqquest_source_v261i/hive_CentOS4.5_parallel/build_261i/quest_ompi.x [0x4f5cfd]
[3] func:/project/source/seqquest/seqquest_source_v261i/hive_CentOS4.5_parallel/build_261i/quest_ompi.x(rhosave_+0x120) [0x4f6a8a]
[4] func:/project/source/seqquest/seqquest_source_v261i/hive_CentOS4.5_parallel/build_261i/quest_ompi.x(MAIN__+0xb710) [0x431770]
[5] func:/project/source/seqquest/seqquest_source_v261i/hive_CentOS4.5_parallel/build_261i/quest_ompi.x(main+0xe) [0xa717ee]
[6] func:/lib64/tls/libc.so.6(__libc_start_main+0xdb) [0x393b11c3fb]
[7] func:/project/source/seqquest/seqquest_source_v261i/hive_CentOS4.5_parallel/build_261i/quest_ompi.x(free+0x3a) [0x425fca]
*** End of error message ***
mpiexec: Warning: task 0 died with signal 11 (Segmentation fault).
While trying to debug this code, I noticed that the math library is an
Intel math library, but everything else, including ScaLAPACK and
BLACS, was compiled with the GNU compiler. Will there be compatibility
issues?
There *could* be. Have you tried to compile everything with the GNU
compiler?
You might also try to examine what exactly in free() is going bad --
are you passing a bad address to free? Can you run the code through a
debugger and/or examine corefiles?
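
As a rough illustration of what "Signal:11 ... Failing at addr:(nil)"
means, here is a minimal Python sketch, assuming a Linux/POSIX system.
It is not taken from Seqquest or Open MPI; the helper name
run_null_deref is made up for this example. A child process reads
memory at address 0, and the parent reports which signal killed it.

```python
import signal
import subprocess
import sys

# Child program (illustrative only): read memory at address 0,
# i.e. dereference a NULL pointer, using ctypes.
CHILD = "import ctypes; ctypes.string_at(0)"


def run_null_deref():
    """Run the child and return its exit status.

    On POSIX, subprocess reports a negative return code when the
    child was killed by a signal (the value is -signal_number).
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


if __name__ == "__main__":
    rc = run_null_deref()
    if rc < 0:
        print("child killed by", signal.Signals(-rc).name)
    else:
        print("child exited with status", rc)
```

In the real application the equivalent step would be to load the
executable and its core file into a debugger and look at the frame
that touched the nil address.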
--
Jeff Squyres
Cisco Systems