On Feb 25, 2009, at 12:25 PM, Ken Mighell wrote:

We are trying to compile the code with Open MPI on a Mac Pro with 2 quad-core Xeons using gfortran.

The code seems to be running ... for the most part. Unfortunately we keep getting a segfault
which spits out a variant of the following message:

[oblix:21522] *** Process received signal ***
[oblix:21522] Signal: Segmentation fault (11)
[oblix:21522] Signal code: Address not mapped (1)
[oblix:21522] Failing at address: 0xc0000710
[oblix:21522] [ 0] 2 libSystem.B.dylib 0x92a892bb _sigtramp + 43
[oblix:21522] [ 1] 3 ??? 0xffffffff 0x0 + 4294967295
[oblix:21522] [ 2] 4 exe.out 0x0001281b MAIN__ + 4875
[oblix:21522] [ 3] 5 exe.out 0x00013c38 main + 40
[oblix:21522] [ 4] 6 exe.out 0x00001936 start + 54
[oblix:21522] *** End of error message ***

After researching the error message and digging around in the Open MPI users' mailing list,
it appears that the bug may be in Open MPI.

I'm not sure what you mean by this -- getting a stack trace out of Open MPI doesn't necessarily mean a bug in Open MPI.

Can you get a corefile and see what exactly failed? Or run under a debugger to see where and how exactly the process fails? From the stack trace above, the frames below the signal handler are all in exe.out (MAIN__, main, start), so it looks like the failure occurs in application code, not Open MPI...?
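
For what it's worth, here is a rough sketch of how the two suggestions above might look in practice. It is only an illustration under assumptions: the source file name (main.f90), the process count (4), and the helper function names are all hypothetical, and it assumes the Open MPI wrapper compiler mpif90, gdb, and xterm are available on the machine. Nothing runs until you call one of the functions.

```shell
#!/bin/sh
# Sketch only: main.f90, the process count of 4, and the function
# names below are assumptions, not taken from the original report.

# Rebuild with debug symbols, no optimization, and gfortran's runtime
# array-bounds checks (newer gfortran spells this -fcheck=bounds).
build_debug() {
    mpif90 -g -O0 -fbounds-check -o exe.out main.f90
}

# Allow core files to be written from this shell, then run as usual.
run_with_cores() {
    ulimit -c unlimited
    mpirun -np 4 ./exe.out
}

# Load a core file into gdb. Pass the path of the core that was
# written (on Mac OS X these usually land in /cores/core.<pid>).
inspect_core() {
    gdb ./exe.out "$1"
}

# Alternative: start every MPI process under gdb in its own xterm.
run_under_gdb() {
    mpirun -np 4 xterm -e gdb ./exe.out
}
```

Once gdb has the core (or has stopped on the segfault), "bt" prints the backtrace and "frame N" selects a frame so you can inspect variables there. With bounds checking enabled, an out-of-range array index will usually abort with the offending source line before it ever turns into a segfault.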

--
Jeff Squyres
Cisco Systems
