[Bug fortran/25829] [F03] Asynchronous IO support

koenigni at gcc dot gnu.org Mon, 02 Oct 2017 02:46:24 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25829


--- Comment #27 from Nicolas Koenig <koenigni at gcc dot gnu.org> ---
(In reply to Janne Blomqvist from comment #26)
> I thought I wrote somewhere why I gave up on this, after thinking a lot about
> the problem in general. However, I can't find my writeup now, so I'll add a
> short version here so that others who are interested in this problem may
> benefit.
> 
> So, to begin with, non-blocking socket I/O is widely used on Linux and works
> well (select(), epoll() etc.). However, here we're talking about file IO,
> not sockets. For file IO, the non-blocking socket programming model doesn't
> work; files are always considered "fast" devices and thus always return
> ready if you try to poll them. Thus, asynchronous I/O. The choices are
> roughly:
> 
> 1) Linux native AIO: syscalls like io_submit() etc. This however works only
> on files opened with O_DIRECT, and all I/O must be 512-byte aligned. So
> clearly this disqualifies this solution for something general purpose like
> Fortran AIO.
> 
> 2) POSIX AIO (aio_read() etc.). This, in principle, could work. Except for
> 1) It uses signals for reporting completions, which is horrible. Also, some
> may consider it bad form if libgfortran uses (limited) signal numbers for
> its internal use, preventing applications from using them.  2) On Linux,
> glibc implements POSIX AIO using a userspace thread pool, with the further
> restriction that only a single outstanding I/O per file descriptor is
> possible (which may or may not matter for Fortran AIO). 
> 
> 3) Do it yourself with a thread pool. Similar to POSIX AIO on Linux/glibc,
> except you can use something more sane than signals for signaling completion
> (e.g. pipes or a pure userspace queue).
> 
> See also e.g. http://blog.libtorrent.org/2012/10/asynchronous-disk-io/
> 
> 
> So, the only solution that has the potential to work well and is portable is
> #3. It's a fair amount of work, though, and in the end I wasn't convinced it
> was worth the effort.

At the moment I only plan on using the normal pthread API. My idea for an
algorithm would be something like this:

=> if a unit is opened with the "asynchronous" flag, a new thread is spun up
for this unit.
=> when a TRANSFER_* function is called, the buffer and all the other necessary
information are enqueued in an asynchronous work queue. (see below)
=> the thread is notified that work has been added
=> the thread takes care of the I/O
=> when the unit is closed, pthread_join() is called

I plan to enqueue the pdt->transfer() calls with their respective arguments in
the work queue.
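
The steps above can be sketched in C roughly as follows. This is only an
illustration under assumptions, not the prototype mentioned in this comment and
not libgfortran code: async_unit, work_item, unit_open(), unit_enqueue(),
unit_close() and the demo task are all hypothetical names, and a plain function
pointer stands in for a queued transfer call.

```c
#include <pthread.h>
#include <stdlib.h>

/* All names below are hypothetical; none of them come from libgfortran. */
typedef struct work_item {
    void (*fn)(void *);          /* stands in for a queued transfer call */
    void *arg;
    struct work_item *next;
} work_item;

typedef struct {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t has_work;
    work_item *head, *tail;      /* FIFO, so transfers keep their order */
    int closing;
} async_unit;

/* Worker: sleeps until work arrives, exits once closing and drained. */
static void *worker(void *p)
{
    async_unit *u = p;
    for (;;) {
        pthread_mutex_lock(&u->lock);
        while (u->head == NULL && !u->closing)
            pthread_cond_wait(&u->has_work, &u->lock);
        work_item *w = u->head;
        if (w == NULL) {         /* closing was set and the queue is empty */
            pthread_mutex_unlock(&u->lock);
            return NULL;
        }
        u->head = w->next;
        if (u->head == NULL)
            u->tail = NULL;
        pthread_mutex_unlock(&u->lock);
        w->fn(w->arg);           /* do the I/O outside the lock */
        free(w);
    }
}

/* "open with asynchronous flag": spin up one thread for the unit. */
static int unit_open(async_unit *u)
{
    u->head = u->tail = NULL;
    u->closing = 0;
    pthread_mutex_init(&u->lock, NULL);
    pthread_cond_init(&u->has_work, NULL);
    return pthread_create(&u->thread, NULL, worker, u);
}

/* "TRANSFER_* called": enqueue the call and notify the thread. */
static int unit_enqueue(async_unit *u, void (*fn)(void *), void *arg)
{
    work_item *w = malloc(sizeof *w);
    if (w == NULL)
        return -1;
    w->fn = fn;
    w->arg = arg;
    w->next = NULL;
    pthread_mutex_lock(&u->lock);
    if (u->tail)
        u->tail->next = w;
    else
        u->head = w;
    u->tail = w;
    pthread_cond_signal(&u->has_work);
    pthread_mutex_unlock(&u->lock);
    return 0;
}

/* "close": let the worker drain the queue, then pthread_join() it. */
static void unit_close(async_unit *u)
{
    pthread_mutex_lock(&u->lock);
    u->closing = 1;
    pthread_cond_signal(&u->has_work);
    pthread_mutex_unlock(&u->lock);
    pthread_join(u->thread, NULL);
    pthread_mutex_destroy(&u->lock);
    pthread_cond_destroy(&u->has_work);
}

/* Demo task and driver; the counter is only touched by the worker. */
static int processed;
static void count_task(void *arg) { processed += *(int *)arg; }

static int run_demo(void)
{
    static int one = 1;
    async_unit u;
    processed = 0;
    if (unit_open(&u) != 0)
        return -1;
    for (int k = 0; k < 3; k++)
        unit_enqueue(&u, count_task, &one);
    unit_close(&u);
    return processed;
}
```

Build with -pthread. The queue is a simple linked FIFO guarded by one mutex and
one condition variable; unit_close() deliberately lets the worker finish all
queued items before joining, so nothing is dropped on close.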

I actually already have a small prototype that implements the principle behind
this in C and it works :)

One problem I have found so far with this approach is illustrated by the
following code snippet:

program main
    implicit none
    open (10, file='foo.dat', asynchronous='yes')
    call s()
    close(10)
contains
    subroutine s()
        integer, dimension(3) :: i ! presumably on the stack
        i = [0, 1, 2]
        write(10,*) i
        ! Now the stack frame is dropped: the enqueued pointer to the array
        ! is left dangling, but the transfer may not have run yet
    end subroutine
end program
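
One conceivable way around this, which is not something proposed in this
comment and is only a sketch under assumptions, would be to copy the caller's
data into the queue entry at enqueue time, so that the worker never dereferences
the caller's stack. The names owned_buf, owned_buf_copy(), owned_buf_free() and
enqueue_from_stack() are hypothetical. The cost is an extra allocation and
memcpy per transfer, and it only helps for output, since a read still has to
store into the caller's variable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical queue payload that owns a private copy of the data. */
typedef struct {
    void *data;
    size_t len;
} owned_buf;

/* Copy len bytes from src into freshly allocated storage. */
static owned_buf *owned_buf_copy(const void *src, size_t len)
{
    owned_buf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(len ? len : 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->data, src, len);
    b->len = len;
    return b;
}

static void owned_buf_free(owned_buf *b)
{
    if (b) {
        free(b->data);
        free(b);
    }
}

/* Mimics subroutine s(): the array lives on the stack, the copy does not. */
static owned_buf *enqueue_from_stack(void)
{
    int i[3] = {0, 1, 2};
    return owned_buf_copy(i, sizeof i);   /* safe to use after return */
}
```

The worker would free the copy with owned_buf_free() once the transfer is done.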

Do you see any fundamental problems with this approach or its integration with
libgfortran?
