Hello,

I think I've come across a bug when using ROMIO to read in a 2D distributed
array. I've attached a test case to this email.

In the test case I first initialise an array of 25 doubles (which will be a
5x5 grid), then create a datatype representing a 5x5 matrix distributed in
3x3 blocks over a 2x2 process grid. As a reference I use MPI_Pack to pull
out the block-cyclic array elements local to each process (which I believe
is correct). Then I write the original array of 25 doubles to disk, use
MPI-IO to read it back in (performing the Open, Set_view, and Read_all),
and compare the result against the reference.
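
For reference, the read path in the attached test case boils down to
creating the darray type and then using it as the file view (excerpted from
the full program below):

  MPI_Type_create_darray(size, rank, 2, gsize, distrib,
                         bsize, psize, MPI_ORDER_FORTRAN, MPI_DOUBLE, &darray);
  MPI_Type_commit(&darray);

  MPI_File_open(MPI_COMM_WORLD, "tmpfile.dat", MPI_MODE_RDONLY,
                MPI_INFO_NULL, &mpi_fh);
  MPI_File_set_view(mpi_fh, 0, MPI_DOUBLE, darray, "native", MPI_INFO_NULL);
  MPI_File_read_all(mpi_fh, ldata, nelem, MPI_DOUBLE, &status);
  MPI_File_close(&mpi_fh);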

Running this with OMPIO, the two match on all ranks.

> mpirun -mca io ompio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read:   15.0 16.0 17.0 20.0 21.0 22.0

=== Rank 2 === (6 elements)
Packed:  3.0  4.0  8.0  9.0 13.0 14.0
Read:    3.0  4.0  8.0  9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read:   18.0 19.0 23.0 24.0



However, using ROMIO the two differ on two of the ranks:

> mpirun -mca io romio -np 4 ./darr_read.x
=== Rank 0 === (9 elements)
Packed:  0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0
Read:    0.0  1.0  2.0  5.0  6.0  7.0 10.0 11.0 12.0

=== Rank 1 === (6 elements)
Packed: 15.0 16.0 17.0 20.0 21.0 22.0
Read:    0.0  1.0  2.0  0.0  1.0  2.0

=== Rank 2 === (6 elements)
Packed:  3.0  4.0  8.0  9.0 13.0 14.0
Read:    3.0  4.0  8.0  9.0 13.0 14.0

=== Rank 3 === (4 elements)
Packed: 18.0 19.0 23.0 24.0
Read:    0.0  1.0  0.0  1.0



My interpretation is that the behaviour with OMPIO is correct.
Interestingly, everything matches up with both ROMIO and OMPIO if I set the
block size to 2x2.
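
For what it's worth, below is a small standalone sketch (no MPI) of the
mapping I am assuming: a cyclic(3) x cyclic(3) distribution of a
Fortran-ordered 5x5 array over a 2x2 process grid ranked in row-major
order, which is my reading of how MPI_Type_create_darray numbers the
processes. Its output reproduces the "Packed" lines above, which is why I
believe the OMPIO results are the correct ones.

#include <stdio.h>

#define NSIDE  5
#define NBLOCK 3
#define NPROC  2

/* Sketch only: print the global elements I expect each rank to own under
   a cyclic(NBLOCK) x cyclic(NBLOCK) distribution of an NSIDE x NSIDE
   Fortran-ordered array over an NPROC x NPROC process grid ranked in
   row-major order.  The printed values are the linear indices used to
   fill data[] in the attached test case. */
int main(void)
{
  int r, i, j;

  for (r = 0; r < NPROC * NPROC; r++) {
    printf("Rank %d:", r);
    /* Walk the global array in Fortran (column-major) order, which is
       also the order the owned elements appear in the packed buffer. */
    for (j = 0; j < NSIDE; j++) {
      for (i = 0; i < NSIDE; i++) {
        int prow  = (i / NBLOCK) % NPROC;  /* grid row owning index i    */
        int pcol  = (j / NBLOCK) % NPROC;  /* grid column owning index j */
        int owner = prow * NPROC + pcol;   /* row-major grid ranking     */
        if (owner == r)
          printf(" %4.1f", (double)(i + j * NSIDE));
      }
    }
    printf("\n");
  }
  return 0;
}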

This was run on OS X using 1.8.2a1r31632. I have also run this on Linux
with Open MPI 1.7.4, where OMPIO is still correct, but with ROMIO I just
get segfaults.

Thanks,
Richard
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define NSIDE 5
#define NBLOCK 3
#define NPROC 2
 
int main(int argc, char *argv[]) 
{ 
  int i, j; 

  int rank, size;
  int bpos, err;

  MPI_Datatype darray;
  MPI_Status status;
  MPI_File mpi_fh;

  /* Define array distribution.
     A 2x2 block size works with ROMIO, a 3x3 block size breaks it. */
  int distrib[2] = { MPI_DISTRIBUTE_CYCLIC, MPI_DISTRIBUTE_CYCLIC };
  int bsize[2] = { NBLOCK, NBLOCK };
  int gsize[2] = { NSIDE, NSIDE };
  int psize[2] = { NPROC, NPROC };

  double data[NSIDE*NSIDE];
  double *ldata, *pdata;

  int tsize, packsize, nelem;

  FILE * dfile;
 
  MPI_Init(&argc, &argv);

  MPI_Comm_size(MPI_COMM_WORLD, &size);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  /* Set up type */
  MPI_Type_create_darray(size, rank, 2, gsize, distrib,
			 bsize, psize, MPI_ORDER_FORTRAN, MPI_DOUBLE, &darray);
  MPI_Type_commit(&darray);
  MPI_Type_size(darray, &tsize);
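  /* Number of array elements local to this rank */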
  nelem = tsize / sizeof(double);

  for(i = 0; i < (NSIDE*NSIDE); i++) data[i] = i;

  if (rank == 0) {
    dfile = fopen("tmpfile.dat", "w+");
    fwrite(data, sizeof(double), (NSIDE*NSIDE), dfile);
    fclose(dfile);
  }
  MPI_Barrier(MPI_COMM_WORLD);

  /* Allocate buffer */
  ldata = (double *)malloc(tsize);
  pdata = (double *)malloc(tsize);

  /* Use Pack to pull out array */
  bpos = 0;
  err = MPI_Pack(data, 1, darray, pdata, tsize, &bpos, MPI_COMM_WORLD);

  MPI_Barrier(MPI_COMM_WORLD);

  /* Read in array from file.  */
  MPI_File_open(MPI_COMM_WORLD, "tmpfile.dat", MPI_MODE_RDONLY, MPI_INFO_NULL, &mpi_fh);
  MPI_File_set_view(mpi_fh, 0, MPI_DOUBLE, darray, "native", MPI_INFO_NULL);
  MPI_File_read_all(mpi_fh, ldata, nelem, MPI_DOUBLE, &status);
  MPI_File_close(&mpi_fh);

  for(i = 0; i < size; i++) {
    MPI_Barrier(MPI_COMM_WORLD);
    if(rank == i) {
      printf("=== Rank %i === (%i elements) \nPacked: ", rank, nelem);
      for(j = 0; j < nelem; j++) {
        printf("%4.1f ", pdata[j]);
        fflush(stdout);
      }
      printf("\nRead:   ");
      for(j = 0; j < nelem; j++) {
        printf("%4.1f ", ldata[j]);
        fflush(stdout);
      }
      printf("\n\n");
      fflush(stdout);
    }
  }

  /* Clean up */
  free(ldata);
  free(pdata);
  MPI_Type_free(&darray);

  MPI_Finalize();

  return 0;

} 
