svn-role <svn-r...@apache.org> writes: > Merge r1883355 from trunk: > > * r1883355 > Use the APR-1.4+ API for flushing file contents to disk. > Justification: > Reduce code duplication between APR and SVN. > Votes: > +1: brane, jun66j5, markphip … > + do { > + apr_err = apr_file_datasync(file); > + } while(APR_STATUS_IS_EINTR(apr_err)); > + > + /* If the file is in a memory filesystem, fsync() may return > + EINVAL. Presumably the user knows the risks, and we can just > + ignore the error. */ > + if (APR_STATUS_IS_EINVAL(apr_err)) > + return SVN_NO_ERROR; > + > + if (apr_err) > + return svn_error_wrap_apr(apr_err, > + _("Can't flush file '%s' to disk"), > + try_utf8_from_internal_style(fname, pool));
A few thoughts on this change: 1) Previously, the check for EINVAL within the svn_io_file_flush_to_disk() function only happened in the #else block, so it did not affect the behavior on Windows. With the change, the check happens unconditionally on all platforms. On Windows, APR_STATUS_IS_EINVAL() is defined as follows: #define APR_STATUS_IS_EINVAL(s) ((s) == APR_EINVAL \ || (s) == APR_OS_START_SYSERR + ERROR_INVALID_ACCESS \ || (s) == APR_OS_START_SYSERR + ERROR_INVALID_DATA \ || (s) == APR_OS_START_SYSERR + ERROR_INVALID_FUNCTION \ || (s) == APR_OS_START_SYSERR + ERROR_INVALID_HANDLE \ || (s) == APR_OS_START_SYSERR + ERROR_INVALID_PARAMETER \ || (s) == APR_OS_START_SYSERR + ERROR_NEGATIVE_SEEK) So with this change, all of these error codes are now going to be ignored, and the svn_io_file_flush_to_disk() function will return SVN_NO_ERROR, indicating the success of flushing the data to disk. Some of these error codes, such as ERROR_INVALID_HANDLE, are quite generic. I would think that this change opens a data corruption possibility in the following case: - A real error happens during FlushFileBuffers(). - It gets translated into one of the error codes above by the I/O stack. - The error is ignored in the implementation of svn_io_file_flush_to_disk(). - The caller of svn_io_file_flush_to_disk() interprets this situation as a successful data flush. - He continues to work under an assumption that the data was successfully flushed to disk, whereas in fact it was not. For example, by committing a repository transaction. - A power loss after the transaction commit causes the data to be lost or corrupted. 2) Unless I am mistaken, this change replaces an fcntl(F_FULLFSYNC) with fdatasync() when the latter is supported. So assuming that both operations are supported, the new behavior of the svn_io_file_flush_to_disk() function has weaker integrity guarantees than before, because fdatasync() does not ensure that metadata such as the last modification time is written to disk. I haven't done a thorough check if we rely on mtime being flushed to the disk in our code, but even if we don't, svn_io_file_flush_to_disk() is part of the public API, so altering its integrity guarantees in a patch release might be unexpected for its callers. Thoughts? (A small disclaimer: these changes have slipped past my attention until now, when I tried updating to 1.14.2. But better late than sorry…) Thanks, Evgeny Kotkov