On Mon, Oct 17, 2011 at 23:52, Janne Blomqvist <blomqvist.ja...@gmail.com> wrote:
> On Mon, Oct 17, 2011 at 19:03, Tobias Burnus <bur...@net-b.de> wrote:
>> Hi Janne,
>>
>> On 10/17/2011 05:30 PM, Janne Blomqvist wrote:
>>>
>>> On Mon, Oct 17, 2011 at 15:49, Tobias Burnus<bur...@net-b.de> wrote:
>>>>
>>>> This patch adds a call to _commit() on _WIN32 for the FLUSH subroutine
>>>> and
>>>> the FLUSH statement. It removes the _commit from gfortran's buf_flush.
>>>
>>> Like I argued in this message
>>> http://gcc.gnu.org/ml/fortran/2011-10/msg00094.html, I think this is a gross
>>> mistake.
>>
>> [...]
>>
>> And I think it is a mistake to not make the data available to other
>> processes as it is indicated by the Fortran 2008 standard:
>>
>> "Execution of a FLUSH statement causes data written to an external file to be
>> available to other processes, or causes data placed in an external file by
>> means other than Fortran to be available to a READ statement. These actions
>> are processor dependent."
>>
>> Thus, I think it makes sense for FLUSH to call _commit on Windows.
>
> I'm not actually sure we can draw such conclusions. What we know is
> that metadata updates to the directory are delayed. It wouldn't
> surprise me if opening a file (which presumably is a atomic operation
> in order to avoid race conditions just like on POSIX) forces the
> kernel to sync metadata of other handles to the same file.

I did some further googling, and
http://stackoverflow.com/questions/2883691/fflush-on-stdout seems to
suggest that my explanation above is roughly what is happening. That
is, opening and closing the file in another program ("type" in the
link above) flushes the metadata to the directory. Also, from the MSDN
link referenced there:

"The only guarantee about a file timestamp is that the file time is
correctly reflected when the handle that makes the change is closed."

(which would suggest that the same holds for other file metadata as
well, such as the size).

Explanation of file caching in Windows:

http://msdn.microsoft.com/en-us/library/aa364218%28v=VS.85%29.aspx

Some MS presentation about how to make apps behave nicely with SMB:

http://mschnlnine.vo.llnwd.net/d1/pdc08/PPTX/ES23.pptx

Under the heading of "Platform Support for Metadata Caching":

"Metadata caching is best effort and there are very limited
consistency guarantees
Metadata caches expire after a fixed time"

(Though it's not entirely clear if the above "platform support" means
only SMB or Windows filesystem semantics in general.)

So, I think the picture is essentially:

- write() (which is a wrapper around WriteFile(EX)) transfers data and
  metadata to the system cache (kernel page cache in unix
  terminology).

- However, the metadata is written to the directory lazily, allowing
  applications that do stat() (or the equivalent Win32 API call(s)) on
  a pathname to see stale data. Doing stat() on a pathname is,
  incidentally, what gfortran does, and it is what caused the issue
  that led to the introduction of _commit in the first place.

- Closing the file, or _commit() (a wrapper around
  FlushFileBuffers()), or (per the stackoverflow link above)
  opening/closing the file in another process will force the directory
  flush immediately.

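To make the distinction in the list above concrete, here is a rough C
sketch of the two levels of flushing, plus a size query that goes
through the open handle instead of the pathname. This is only an
illustration, not libgfortran code: flush_to_os(), flush_to_disk() and
size_via_handle() are hypothetical names I made up, and the fsync()
fallback for non-Windows systems is my own assumption so that the
sketch compiles elsewhere.

```c
/* Illustrative sketch only; helper names are hypothetical and this is
   not how libgfortran is written. */
#ifndef _WIN32
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fsync() */
#endif

#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

#ifdef _WIN32
#include <io.h>                   /* _commit(), _fileno() */
#define FD_OF(fp)   _fileno(fp)
#define SYNC_FD(fd) _commit(fd)   /* wrapper around FlushFileBuffers() */
#else
#include <unistd.h>               /* fsync() */
#define FD_OF(fp)   fileno(fp)
#define SYNC_FD(fd) fsync(fd)     /* assumed POSIX counterpart */
#endif

/* Cheap flush: hand the user-space buffer to the OS cache.
   Returns 0 on success, -1 on failure. */
static int flush_to_os(FILE *fp)
{
    assert(fp != NULL);
    return fflush(fp) == 0 ? 0 : -1;
}

/* Expensive flush: additionally ask the OS to write its cached data
   and metadata out. Returns 0 on success, -1 on failure. */
static int flush_to_disk(FILE *fp)
{
    if (flush_to_os(fp) != 0)
        return -1;
    return SYNC_FD(FD_OF(fp)) == 0 ? 0 : -1;
}

/* Query the file size through the open descriptor rather than through
   the pathname. Returns -1 on failure. */
static long long size_via_handle(FILE *fp)
{
    struct stat st;

    assert(fp != NULL);
    if (fstat(FD_OF(fp), &st) != 0)
        return -1;
    return (long long) st.st_size;
}
```

The point of the split is that a runtime can call something like
flush_to_os() by default and leave flush_to_disk() to the programs that
really need it, while asking for the size through the handle avoids
depending on the lazily updated directory entry.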
Some investigation into how other language support libraries handle
this:

- For MS-DOS compatibility (back when OS file caching wasn't that
  advanced, one presumes), the MS C runtime allows the user to link in
  an extra object, COMMODE.OBJ, which makes fflush() also call
  _commit(), recreating the MS-DOS behavior. By default, however, this
  is not done. Also, there are some nonstandard flags that can be
  passed to fopen() to indicate that one wants the MS-DOS behavior.

- For the MS C++ compiler, the same COMMODE.OBJ linking can be done,
  and there is some nonstandard extension allowing one to get the fd
  from a C++ stream and then call _commit on it. By default flush() on
  a stream does not call _commit/FlushFileBuffers, it just does a
  WriteFile().

- For .NET, there is the FileStream.Flush() method, which flushes the
  user-space buffer without calling _commit/FlushFileBuffers. In the
  latest version of .NET, there is a new FileStream.Flush(bool) method
  which can be used to call FlushFileBuffers. In previous .NET
  versions, people used PInvoke (the native code calling interface) to
  call FlushFileBuffers if needed.

- For the runtimes provided with GCC, grepping the source tree shows
  that libgfortran is the only occurrence of _commit. In libjava there
  is a FlushFileBuffers call, but it's #if 0'ed away.

That is, except for libgfortran, no other language runtime makes an
effort by default to synchronize the metadata.

In conclusion, I still think that my previous patch, which got rid of
the _commit and of the reliance on stat() via pathname, is the correct
approach. "dir" might show stale data for the file size, but this
seems to be the norm on Windows, and opening the file in another
process will give the correct data. So in practice I don't think there
will be any problems from getting rid of _commit.

-- 
Janne Blomqvist