On 05/07/2014 11:36 AM, Rob Latham wrote:


On 05/05/2014 09:20 PM, Richard Shaw wrote:
Hello,

I think I've come across a bug when using ROMIO to read in a 2D
distributed array. I've attached a test case to this email.

Thanks for the bug report and the test case.

I've opened an MPICH bug (because this is ROMIO's fault, not OpenMPI's
fault... until I can prove otherwise! :>)

http://trac.mpich.org/projects/mpich/ticket/2089

Fascinating. I can reproduce the bug with ROMIO from MPICH master... because a patch I put in three days ago is not playing nicely with this particular darray.

However, mpich/master from last week deals with this datatype just fine.

But, the ROMIO from mpich/master and the ROMIO in OpenMPI should be pretty much the same as far as darray datatypes are concerned, so I don't know what the heck is going on... yet.

==rob




In the testcase I first initialise an array of 25 doubles (which will be
a 5x5 grid), then I create a datatype representing a 5x5 matrix
distributed in 3x3 blocks over a 2x2 process grid. As a reference I use
MPI_Pack to pull out the block cyclic array elements local to each
process (which I think is correct). Then I write the original array of
25 doubles to disk, and use MPI-IO to read it back in (performing the
Open, Set_view, and Read_all), and compare to the reference.
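
For readers without the attachment, the reference values can be sketched without MPI. This is only an illustration, not the attached test case: the helper names block_cyclic_indices and local_elements are hypothetical, and it assumes the array is stored in Fortran (column-major) order with element value i + 5*j, which is inferred from the output listed further down. The rank-to-grid mapping is row-major, as MPI_Type_create_darray specifies.

```python
def block_cyclic_indices(n, block, nprocs, p):
    """Global indices along one dimension owned by process p
    under a block-cyclic distribution with the given block size."""
    return [i for i in range(n) if (i // block) % nprocs == p]


def local_elements(rank, n=5, block=3, grid=(2, 2)):
    """Values a rank should hold, in packed order.

    Assumption: column-major storage, so element (i, j) has value i + n*j.
    Ranks map to process-grid coordinates in row-major order.
    """
    pi, pj = divmod(rank, grid[1])  # row-major process coordinates
    rows = block_cyclic_indices(n, block, grid[0], pi)
    cols = block_cyclic_indices(n, block, grid[1], pj)
    # Column-major packing: the first index varies fastest.
    return [float(i + n * j) for j in cols for i in rows]


if __name__ == "__main__":
    for rank in range(4):
        vals = local_elements(rank)
        print(f"=== Rank {rank} === ({len(vals)} elements)")
        print("Expected:", " ".join(f"{v:4.1f}" for v in vals))
```

Under those assumptions, the values computed for each rank are the ones shown on the "Packed" lines of the output.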

Running this with OMPIO, the two match on all ranks.

 > mpirun -mca io ompio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read:   15.0 16.0 17.0 20.0 21.0 22.0

=== Rank 2 === (6 elements)
Packed:  3.0  4.0  8.0  9.0 13.0 14.0
Read:    3.0  4.0  8.0  9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read:   18.0 19.0 23.0 24.0



However, using ROMIO the two differ on two of the ranks:

 > mpirun -mca io romio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read:    0.0  1.0  2.0  0.0  1.0  2.0

=== Rank 2 === (6 elements)
Packed:  3.0  4.0  8.0  9.0 13.0 14.0
Read:    3.0  4.0  8.0  9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read:    0.0  1.0  0.0  1.0



My interpretation is that the behaviour with OMPIO is correct.
Interestingly everything matches up using both ROMIO and OMPIO if I set
the block shape to 2x2.

This was run on OS X using OpenMPI 1.8.2a1r31632. I have also run this on
Linux with OpenMPI 1.7.4; there OMPIO is still correct, but with ROMIO I
just get segfaults.

Thanks,
Richard


_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
