Hello,

I think I've come across a bug when using ROMIO to read in a 2D distributed array. I've attached a test case to this email.
In the test case I first initialise an array of 25 doubles (which will be a 5x5 grid), then I create a datatype representing a 5x5 matrix distributed in 3x3 blocks over a 2x2 process grid. As a reference I use MPI_Pack to pull out the block-cyclic array elements local to each process (which I think is correct). Then I write the original array of 25 doubles to disk, use MPI-IO to read it back in (performing the Open, Set_view, and Read_all), and compare against the reference.

Running this with OMPIO, the two match on all ranks:

> mpirun -mca io ompio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
Read: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read: 15.0 16.0 17.0 20.0 21.0 22.0

=== Rank 2 === (6 elements)
Packed: 3.0 4.0 8.0 9.0 13.0 14.0
Read: 3.0 4.0 8.0 9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read: 18.0 19.0 23.0 24.0

However, using ROMIO the two differ on two of the ranks:

> mpirun -mca io romio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0
Read: 0.0 1.0 2.0 5.0 6.0 7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read: 0.0 1.0 2.0 0.0 1.0 2.0

=== Rank 2 === (6 elements)
Packed: 3.0 4.0 8.0 9.0 13.0 14.0
Read: 3.0 4.0 8.0 9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read: 0.0 1.0 0.0 1.0

My interpretation is that the behaviour with OMPIO is correct (the small standalone sketch after the attached test case works out the expected block-cyclic ownership by hand and matches it). Interestingly, everything matches up with both ROMIO and OMPIO if I set the block shape to 2x2.

This was run on OS X using 1.8.2a1r31632. I have also run it on Linux with Open MPI 1.7.4; there OMPIO is still correct, but with ROMIO I just get segfaults.

Thanks,
Richard
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NSIDE  5
#define NBLOCK 3
#define NPROC  2

int main(int argc, char *argv[])
{
    int i, j;
    int rank, size;
    int bpos, err;
    MPI_Datatype darray;
    MPI_Status status;
    MPI_File mpi_fh;

    /* Define array distribution.
       A 2x2 block size works with ROMIO, a 3x3 block size breaks it. */
    int distrib[2] = { MPI_DISTRIBUTE_CYCLIC, MPI_DISTRIBUTE_CYCLIC };
    int bsize[2]   = { NBLOCK, NBLOCK };
    int gsize[2]   = { NSIDE, NSIDE };
    int psize[2]   = { NPROC, NPROC };

    double data[NSIDE * NSIDE];
    double *ldata, *pdata;

    int tsize, packsize, nelem;

    FILE *dfile;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Set up the block-cyclic darray type. */
    MPI_Type_create_darray(size, rank, 2, gsize, distrib, bsize, psize,
                           MPI_ORDER_FORTRAN, MPI_DOUBLE, &darray);
    MPI_Type_commit(&darray);

    MPI_Type_size(darray, &tsize);
    nelem = tsize / sizeof(double);

    for (i = 0; i < (NSIDE * NSIDE); i++)
        data[i] = i;

    /* Rank 0 writes the full array to disk. */
    if (rank == 0) {
        dfile = fopen("tmpfile.dat", "w+");
        fwrite(data, sizeof(double), (NSIDE * NSIDE), dfile);
        fclose(dfile);
    }
    MPI_Barrier(MPI_COMM_WORLD);

    /* Allocate buffers. */
    ldata = (double *) malloc(tsize);
    pdata = (double *) malloc(tsize);

    /* Use Pack to pull out the local array elements as a reference. */
    bpos = 0;
    err = MPI_Pack(data, 1, darray, pdata, tsize, &bpos, MPI_COMM_WORLD);

    MPI_Barrier(MPI_COMM_WORLD);

    /* Read the array back in from the file. */
    MPI_File_open(MPI_COMM_WORLD, "tmpfile.dat", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &mpi_fh);
    MPI_File_set_view(mpi_fh, 0, MPI_DOUBLE, darray, "native", MPI_INFO_NULL);
    MPI_File_read_all(mpi_fh, ldata, nelem, MPI_DOUBLE, &status);
    MPI_File_close(&mpi_fh);

    /* Print the packed reference and the MPI-IO result, rank by rank. */
    for (i = 0; i < size; i++) {
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank == i) {
            printf("=== Rank %i === (%i elements) \nPacked: ", rank, nelem);
            for (j = 0; j < nelem; j++) {
                printf("%4.1f ", pdata[j]);
                fflush(stdout);
            }
            printf("\nRead: ");
            for (j = 0; j < nelem; j++) {
                printf("%4.1f ", ldata[j]);
                fflush(stdout);
            }
            printf("\n\n");
            fflush(stdout);
        }
    }

    MPI_Finalize();
    exit(0);
}
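
For reference, here is a small standalone sketch (separate from the test case above, and needing no MPI) of the distribution I expect: it works out the block-cyclic ownership by hand, assuming the darray process grid is laid out in row-major order as the MPI standard describes, and it reproduces exactly the "Packed" lines above, which is why I believe the OMPIO output is the correct one.

#include <stdio.h>

#define NSIDE  5
#define NBLOCK 3
#define NPROC  2

int main(void)
{
    int rank, r, c;

    /* The darray process grid is ordered row-major, so rank = prow * NPROC + pcol. */
    for (rank = 0; rank < NPROC * NPROC; rank++) {
        int prow = rank / NPROC;
        int pcol = rank % NPROC;

        printf("Rank %d:", rank);

        /* Walk the 5x5 array in column-major (Fortran) order and print the
           elements whose block-cyclic owner is this rank.  In the test case
           the value stored at (row r, column c) is r + NSIDE*c. */
        for (c = 0; c < NSIDE; c++)
            for (r = 0; r < NSIDE; r++)
                if ((r / NBLOCK) % NPROC == prow &&
                    (c / NBLOCK) % NPROC == pcol)
                    printf(" %4.1f", (double)(r + NSIDE * c));

        printf("\n");
    }

    return 0;
}

Compiled with a plain C compiler and run without mpirun, this prints one line per rank containing exactly the elements listed as "Packed" above (9, 6, 6 and 4 elements for ranks 0 to 3).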