On 05/07/2014 03:10 PM, Richard Shaw wrote:
Thanks Rob. I'll keep track of it over there. How often do updated
versions of ROMIO get pulled over from MPICH into OpenMPI?

On a slightly related note, I think I heard that you had fixed the 32-bit
issues in ROMIO that were causing it to break when reading more than 2 GB
(see http://www.open-mpi.org/community/lists/users/2012/07/19762.php). Have
those fixes been pulled into OpenMPI? I've been staying clear of ROMIO for a
while (in favour of OMPIO) to avoid those issues.

Looks like I fixed that late last year. A slew of ">31 bit transfers" fixes went into the MPICH-3.1 release. Slurping those changes, which are individually small (using some _x versions of type-inquiry routines here, some MPI_Count promotions there) but pervasive, might give OpenMPI a bit of a headache.
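
As an illustration only (not the actual ROMIO changes): the promotion in question replaces the int-returning type-size queries with their MPI-3 "_x" counterparts, which report sizes as MPI_Count. A minimal standalone sketch:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* A contiguous type of 2^29 doubles describes 2^32 bytes, more than
     * a signed int can hold. */
    MPI_Datatype big;
    MPI_Type_contiguous(1 << 29, MPI_DOUBLE, &big);
    MPI_Type_commit(&big);

    int size32;
    MPI_Count size64;
    MPI_Type_size(big, &size32);   /* pre-MPI-3 query: int result; per the
                                      standard it is set to MPI_UNDEFINED
                                      when the size does not fit          */
    MPI_Type_size_x(big, &size64); /* MPI-3 "_x" query: MPI_Count result  */

    printf("MPI_Type_size   : %d\n", size32);
    printf("MPI_Type_size_x : %lld\n", (long long)size64);

    MPI_Type_free(&big);
    MPI_Finalize();
    return 0;
}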

==rob


Thanks,
Richard


On 7 May 2014 12:36, Rob Latham <r...@mcs.anl.gov> wrote:



    On 05/05/2014 09:20 PM, Richard Shaw wrote:

        Hello,

        I think I've come across a bug when using ROMIO to read in a 2D
        distributed array. I've attached a test case to this email.


    Thanks for the bug report and the test case.

    I've opened an MPICH bug (because this is ROMIO's fault, not OpenMPI's
    fault... until I can prove otherwise! :>)

    http://trac.mpich.org/projects/mpich/ticket/2089

    ==rob


        In the testcase I first initialise an array of 25 doubles (which
        will be a 5x5 grid), then I create a datatype representing a 5x5
        matrix distributed in 3x3 blocks over a 2x2 process grid. As a
        reference I use MPI_Pack to pull out the block-cyclic array
        elements local to each process (which I think is correct). Then I
        write the original array of 25 doubles to disk, and use MPI-IO to
        read it back in (performing the Open, Set_view, and Read_all), and
        compare to the reference.
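
        As an illustration only (the attached test case itself isn't
        reproduced here), a rough sketch of that sequence (darray type,
        Open, Set_view, Read_all) could look like the code below; the file
        name "darr.dat" and the size of the local buffer are assumptions:

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int rank, nprocs;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* run with -np 4 */

            /* 5x5 global array of doubles, distributed block-cyclically
             * in 3x3 blocks over a 2x2 process grid */
            int gsizes[2]   = {5, 5};
            int distribs[2] = {MPI_DISTRIBUTE_CYCLIC, MPI_DISTRIBUTE_CYCLIC};
            int dargs[2]    = {3, 3};
            int psizes[2]   = {2, 2};

            MPI_Datatype darray;
            MPI_Type_create_darray(nprocs, rank, 2, gsizes, distribs, dargs,
                                   psizes, MPI_ORDER_C, MPI_DOUBLE, &darray);
            MPI_Type_commit(&darray);

            int nbytes;
            MPI_Type_size(darray, &nbytes);          /* this rank's share   */
            int nelems = nbytes / (int)sizeof(double);
            double local[25];                        /* 25 is an upper bound */

            /* Read the file through the darray filetype; the comparison
             * against the MPI_Pack reference is omitted here. */
            MPI_File fh;
            MPI_File_open(MPI_COMM_WORLD, "darr.dat", MPI_MODE_RDONLY,
                          MPI_INFO_NULL, &fh);
            MPI_File_set_view(fh, 0, MPI_DOUBLE, darray, "native",
                              MPI_INFO_NULL);
            MPI_File_read_all(fh, local, nelems, MPI_DOUBLE,
                              MPI_STATUS_IGNORE);
            MPI_File_close(&fh);

            printf("rank %d read %d elements\n", rank, nelems);

            MPI_Type_free(&darray);
            MPI_Finalize();
            return 0;
        }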

        Running this with OMPI, the two match on all ranks.

          > mpirun -mca io ompio -np 4 ./darr_read.x
        === Rank 0 === (9 elements)
        Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
        Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

        === Rank 1 === (6 elements)
        Packed: 15.0 16.0 17.0 20.0 21.0 22.0
        Read:   15.0 16.0 17.0 20.0 21.0 22.0

        === Rank 2 === (6 elements)
        Packed:  3.0  4.0  8.0  9.0 13.0 14.0
        Read:    3.0  4.0  8.0  9.0 13.0 14.0

        === Rank 3 === (4 elements)
        Packed: 18.0 19.0 23.0 24.0
        Read:   18.0 19.0 23.0 24.0



        However, using ROMIO the two differ on two of the ranks:

          > mpirun -mca io romio -np 4 ./darr_read.x
        === Rank 0 === (9 elements)
        Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
        Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

        === Rank 1 === (6 elements)
        Packed: 15.0 16.0 17.0 20.0 21.0 22.0
        Read:    0.0  1.0  2.0  0.0  1.0  2.0

        === Rank 2 === (6 elements)
        Packed:  3.0  4.0  8.0  9.0 13.0 14.0
        Read:    3.0  4.0  8.0  9.0 13.0 14.0

        === Rank 3 === (4 elements)
        Packed: 18.0 19.0 23.0 24.0
        Read:    0.0  1.0  0.0  1.0



        My interpretation is that the behaviour with OMPIO is correct.
        Interestingly, everything matches up under both ROMIO and OMPIO if
        I set the block shape to 2x2.

        This was run on OS X using OpenMPI 1.8.2a1r31632. I have also run
        it on Linux with OpenMPI 1.7.4, where OMPIO is still correct, but
        with ROMIO I just get segfaults.

        Thanks,
        Richard










--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
