I originally thought that it was an issue related to 32-bit
executables, but it seems to affect 64-bit as well...

I found references to this problem -- it was reported back in 2007:

http://lists.mcs.anl.gov/pipermail/mpich-discuss/2007-July/002600.html


If you look at the code, you will find that MPI_File_read() calls a
file-system-specific I/O driver implementation if one is available;
if not, it uses the generic ad_ufs device (POSIX) implementation.

IIRC, SciNet is using IBM GPFS (BTW, a few years ago when Chris gave
me a tour of the machine room at MP, the cluster he was managing was
using Lustre). Since there is no specific implementation for GPFS,
ROMIO would fall back to ad_ufs, which calls
ADIOI_GEN_ReadContig().

In ADIOI_GEN_ReadContig(), we have code:

ADIO_Offset len;

len  = (ADIO_Offset)datatype_size * (ADIO_Offset)count;

And ADIO_Offset is typedef'ed to MPI_Offset, which is 64 bits wide on
64-bit platforms. So far so good.


However, the way len is used... hmm, that can be an issue:

    ADIOI_Assert(len == (unsigned int) len); /* read takes an unsigned int parm */

    ...

    err = read(fd->fd_sys, buf, (unsigned int)len);


So wait... read takes an unsigned int?? From the manpage:

       ssize_t read(int fd, void *buf, size_t count);

size_t is not unsigned int... the two can have the same width on a
32-bit (ILP32) build, but not when we are LP64, where size_t is
64 bits and unsigned int is still 32 bits.
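
To make the truncation concrete, here is a minimal sketch (not ROMIO
code). offset_t is a hypothetical stand-in for ADIO_Offset, the helper
names are made up for illustration, and it assumes a platform where
unsigned int is 32 bits:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for ADIO_Offset: a signed 64-bit offset. */
typedef int64_t offset_t;

/* Same test as the ADIOI_Assert above: does len survive a round trip
 * through unsigned int? */
static int fits_in_unsigned_int(offset_t len)
{
    return len == (offset_t)(unsigned int)len;
}

/* The byte count read() would actually be asked for after the
 * (unsigned int) cast; the conversion wraps modulo UINT_MAX + 1. */
static unsigned int truncated_count(offset_t len)
{
    return (unsigned int)len;
}

/* Print what happens to a given request size (illustration only). */
static void show_truncation(offset_t len)
{
    assert(len >= 0);
    printf("requested %lld bytes, fits: %s, read() would get %u\n",
           (long long)len,
           fits_in_unsigned_int(len) ? "yes" : "no",
           truncated_count(len));
}
```

With a 32-bit unsigned int, show_truncation(((offset_t)1 << 32) + 16)
reports that the request does not fit and that read() would be asked
for only 16 bytes.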


Other places in ompi/mca/io/romio/romio/mpi-io/read.c also need to be
updated (those are really easy, as they are just sanity checks). But at
least someone can try to fix the root cause by changing the two lines
of code mentioned above, or the ROMIO guys can comment on why an
unsigned int should be passed to read(2)... (Internally, the file
offset (fp_sys_posn) is of type ADIO_Offset, so it should be fine.)

However, I've spent less than 2 hours on this, as I found it
interesting -- 12 years ago I was fixing 32-bit file offset issues at
a supercomputer middleware company, and there are still issues with
32-bit vs 64-bit file pointers today! :-O So I guess 30 years from now,
when we run out of 64-bit space, we will be fixing 32-bit and 64-bit
offset issues for 128-bit applications (that's when we have quantum
computers!) :-D Also, take the suggestions above at your own risk!
(And I still need to read "An Abstract-Device Interface for
Implementing Portable Parallel-I/O Interfaces" to understand more
about the internal structures of ROMIO!)

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/


On Tue, Aug 7, 2012 at 6:02 PM, Richard Shaw <jr...@cita.utoronto.ca> wrote:
> On Tuesday, 7 August, 2012 at 12:21 PM, Rob Latham wrote:
>> Hi. Known problem in the ROMIO MPI-IO implementation (which OpenMPI
>> uses). Been on my list of "things to fix" for a while.
>
> Ok, thanks. I'm glad it's not just us.
>
> Is there a timescale for this being fixed? Because if it's a long term thing, 
> I would suggest it might be worth putting a FAQ entry on it or something 
> similar? Especially as it's quite contradictory to most people's 
> interpretation of the specification. Maybe it's already listed as a known 
> problem somewhere, and I just missed it - it took quite a while before I 
> stopped thinking it was an issue with my code.
>
> Is there a better workaround than just splitting the MPI_File_read up into 
> multiple reads of  <2^31 bytes? We're actually trying to read in a 
> distributed array, and the workaround awkwardly requires the creation and 
> reading of multiple darray types, each designed to read in the correct number 
> of blocks less than 2^31 bytes. This seems like it could be a bit fragile.
>
> Thanks again,
> Richard
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

http://blogs.scalablelogic.com/
