On Thu, Jun 11, 2009 at 05:33:58PM -0400, Greg Fischer wrote:
> I'm attempting to wrap my brain around the MPI I/O mechanisms, and I was
> hoping to find some guidance.  I'm trying to read a file that contains a
> 117-character string, followed by a series of records that contain integers and
> reals.  The following code would read it in serial:
> 
> ---
> character(len=117) :: cfx1
> 
> read (nin) cfx1
> do i=1,end_of_file
>   read(nin) integer1,integer2,real1,real2,real3,real4,real5,real6,real7
> enddo
> ---

Please note that raw binary Fortran i/o acts nothing like raw binary C
i/o.  What I mean is that you have a Fortran read there, and it's
pulling records out of your Fortran file, but who knows what padding
or record markers your compiler put in and around each of these records.
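
To make that concrete, here is a rough sketch of what many compilers
put on disk for an unformatted sequential file.  It assumes 4-byte
little-endian record-length markers before and after every record
(the default I believe gfortran and ifort use on x86, but nothing in
the Fortran standard promises it), plus 4-byte default integers and
reals.  It is Python only because that makes the bytes easy to
inspect; write_record and read_records are names made up for this
illustration.

```python
import io
import struct

# Assumption: 4-byte little-endian record markers (not guaranteed by the standard).
MARKER = struct.Struct("<i")
# Assumption: default INTEGER and REAL are 4 bytes each: 2 integers + 7 reals.
RECORD = struct.Struct("<2i7f")

def write_record(f, payload):
    """Write one record as <length><payload><length>."""
    marker = MARKER.pack(len(payload))
    f.write(marker + payload + marker)

def read_records(f):
    """Yield each record's payload, checking that both markers agree."""
    while True:
        head = f.read(MARKER.size)
        if not head:
            return
        (n,) = MARKER.unpack(head)
        payload = f.read(n)
        (tail,) = MARKER.unpack(f.read(MARKER.size))
        if tail != n:
            raise ValueError("markers disagree: wrong marker size or byte order?")
        yield payload

# Build a small stand-in for the file in the question: string, then 3 records.
buf = io.BytesIO()
write_record(buf, b"x" * 117)
for i in range(3):
    write_record(buf, RECORD.pack(i, 10 * i, *[float(k) for k in range(7)]))

buf.seek(0)
records = list(read_records(buf))
header = records[0]
data = [RECORD.unpack(r) for r in records[1:]]
print(len(header), len(data), len(buf.getvalue()))
```

Under these assumptions each 36-byte data record occupies 44 bytes on
disk and the string occupies 125, so a datatype that describes only
the integers and reals drifts out of step with the file after the
first record.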

> To simplify the problem, I removed the "cfx1" string from the file I'm
> reading, and created an MPI_TYPE_STRUCT as follows:
> 
> ---
>       length( 1 ) = 1
>       length( 2 ) = 2
>       length( 3 ) = 7
>       length( 4 ) = 1
>       disp( 1 ) = 0
>       disp( 2 ) = sizeof( MPI_LB )
>       disp( 3 ) = disp( 2 ) + 2*sizeof(MPI_INTEGER)
>       disp( 4 ) = disp( 3 ) + 7*sizeof(MPI_REAL)
>       type( 1 ) = MPI_LB
>       type( 2 ) = MPI_INTEGER
>       type( 3 ) = MPI_REAL
>       type( 4 ) = MPI_UB
> 
>       call MPI_TYPE_STRUCT( 4, length, disp, type, sptype, ierr )
>       call MPI_TYPE_COMMIT( sptype, ierr )

There's absolutely no guarantee that records line up in memory like
they do in unformatted binary Fortran files.  The compiler could put
more, less, or the same padding between records on disk as it does in
memory.

> This almost works.  With some fiddling (I can't seem to make it work right
> now), I'm able to get most of the reals and integers into "sourcepart", but
> something doesn't line up quite correctly.  I've spent a lot of time looking
> at the documentation and tutorials on the web, but haven't found a resource
> that helps me work through this problem.

Yup.  Take into consideration that I'm a shameless C dude, but Fortran
i/o is pure evil!

> Ultimately, the objective will be to allow an arbitrary number of processes
> to read this file, with each record being uniquely read by a single process.
> (e.g. process 1 reads record 1, process 2 reads record 2, process 1 reads
> record 3, process 2 reads record 4, etc.)
> 
> What's the best way to skin this cat?  Any assistance would be greatly
> appreciated.

Well, you could use something like parallel-netcdf or parallel-HDF5,
which already do everything you want, with the added advantage
of a self-describing, portable file format that you could
exchange with collaborators or visualize with a whole ecosystem of
netcdf viewers.

How did you create this file?  I'm kind of surprised you cannot
MPI_FILE_READ back what you've written.  The MPI-IO library just
provides a wrapper around C system calls and knows nothing about
Fortran record structure, so if you created this file with Fortran
i/o, you'll have to read it back with Fortran i/o.

Since you eventually want to do parallel I/O, I'd suggest creating the
file with MPI-IO (even if it is just MPI_FILE_WRITE from rank 0 alone)
as well as reading it back with MPI-IO (perhaps with
MPI_FILE_READ_AT_ALL).
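
Once the file is a flat stream of fixed-size records, the round-robin
split you describe is just offset arithmetic.  Here is a minimal
sketch of that arithmetic, with plain file seeks standing in for the
MPI calls; no MPI is used, and with MPI-IO the same offsets would go
to something like MPI_FILE_READ_AT.  It assumes packed 4-byte integers
and reals, a 117-byte string at the front, and 0-based ranks (so your
"process 1" is rank 0).  The names record_offset and read_my_records
are hypothetical.

```python
import os
import struct
import tempfile

# Assumption: 2 four-byte integers + 7 four-byte reals, packed, little-endian.
RECORD = struct.Struct("<2i7f")
# Assumption: the 117-character string is stored first, with no record markers.
HEADER_BYTES = 117

def record_offset(index):
    """Byte offset of record `index` (0-based) in a marker-free file."""
    return HEADER_BYTES + index * RECORD.size

def read_my_records(path, rank, nprocs):
    """Read records rank, rank + nprocs, rank + 2*nprocs, ... from the file."""
    nrecords = (os.path.getsize(path) - HEADER_BYTES) // RECORD.size
    mine = []
    with open(path, "rb") as f:
        for index in range(rank, nrecords, nprocs):
            f.seek(record_offset(index))
            mine.append(RECORD.unpack(f.read(RECORD.size)))
    return mine

# Write a flat test file: the string, then 5 records tagged with their index.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * HEADER_BYTES)
    for i in range(5):
        tmp.write(RECORD.pack(i, -i, *([0.5 * i] * 7)))
    path = tmp.name

rank0 = read_my_records(path, 0, 2)   # what rank 0 of 2 would read
rank1 = read_my_records(path, 1, 2)   # what rank 1 of 2 would read
os.remove(path)
print([r[0] for r in rank0], [r[0] for r in rank1])
```

Every record is read by exactly one rank, and no rank needs to know
what the others are doing beyond its own rank and the process count.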

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
