Did you try following the advice from the LAPACK forum thread, i.e.
upgrading your compiler from the Mac OS X default (gcc 4.0.1) to gcc 4.3.0?
Btw, what is the test you're running? Can you create a small test case
so I can try to reproduce it?
Thanks,
george.
On Jun 11, 2009, at 17:02 , Nick Collier wrote:
Hi,
I'm developing under OS X 10.5.7 with Open MPI 1.3.2 and am running
into intermittent corruption when sending and receiving a user-defined
datatype. When running with fewer than four processes (i.e. mpirun -np 2
or -np 3), the data is fine; when running with four or more, the received
data is intermittently corrupted. By corrupted, I mean that what should
be small integer values in a struct come out very large, as if the
memory hasn't been assigned properly. Some runs are fine and others
are not, and the bad runs lead to crashes
like:
[belafonte:30191] *** Process received signal ***
[belafonte:30191] Signal: Bus error (10)
[belafonte:30191] Signal code: (2)
[belafonte:30191] Failing at address: 0x9
[belafonte:30191] [ 0] 2 libSystem.B.dylib 0x945af2bb _sigtramp + 43
[belafonte:30191] [ 1] 3 ??? 0xffffffff 0x0 + 4294967295
I'm not sure how to proceed or what might be wrong. The closest
thing I could find on Google was http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=2&t=614
where someone reports issues with ScaLAPACK in combination
with Open MPI and OS X's stock gcc 4.0.1 that were fixed by using gcc
4.3.1.
At any rate, any suggestions on how to move forward would be
appreciated.
thanks,
Nick