On Mar 13, 2012, at 2:54 PM, Joshua Baker-LePain wrote:
> On Tue, 13 Mar 2012 at 7:53pm, Gutierrez, Samuel K wrote
>
>> The failure signature isn't exactly what we were seeing here at LANL, but
>> there were misplaced memory barriers in Open MPI 1.4.3. Ticket 2619 talks
>> about this issue (https://svn.open-mpi.org/trac/ompi/ticket/2619). This
>> doesn't explain, however, the failures that you are experiencing within Open
>> MPI 1.5.4. Can you give 1.4.4 a whirl and see if this fixes the issue?
>
> Would it be best to use 1.4.4 specifically, or simply the most recent 1.4.x
> (which appears to be 1.4.5 at this point)?
Good point - please do use Open MPI 1.4.5.
>
>> Any more information surrounding your failures in 1.5.4 is greatly
>> appreciated.
>
> I'm happy to provide, but what exactly are you looking for? The test code
> I'm running is *very* simple:
If you experience this type of failure with 1.4.5, can you send another
backtrace? We'll go from there.
One more question: how reproducible is this on your system?
Thanks,
Sam
>
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char **argv)
> {
>     int node;        /* MPI rank of this process */
>     long long i;     /* the loop bound exceeds INT_MAX, so use long long */
>     float f;
>
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &node);
>
>     printf("Hello World from Node %d.\n", node);
>
>     /* busy loop to keep each rank doing work */
>     for (i = 0; i <= 1000000000000LL; i++)
>         f = i*2.718281828*i + i + i*3.141592654;
>
>     MPI_Finalize();
>     return 0;
> }
>
> And my environment is a pretty standard CentOS-6.2 install.
>
> --
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF
> _______________________________________________
> users mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/users