On 2020-02-13 12:52, Michael Paquier wrote:
>> durable_rename() calls fsync_fname(), so it would be covered by this change.
>> The other file access calls in there can be handled by normal error
>> handling, I think.  Is there any specific scenario you have in mind?

> The old file flush is handled by your patch, but not the new one if
> it exists, and it seems to me that we should handle failures
> consistently to make the function easier to reason about, as the
> comment at the top of the function says :)

OK, added in new patch.

> Another point that we could consider is if fsync_fname() should have
> an option to not trigger an immediate exit when facing a failure.  The
> backend has that option thanks to fsync_fname_ext() with its elevel
> argument.  Your choice to default to a failure is fine for most cases
> because that's what we want.  However, I am questioning if this change
> would be surprising for some client applications or not, and if we
> should have the option to choose one behavior or the other.

The option in the backend is between panicking and retrying. The old behavior was to always retry but we have learned that that usually doesn't work.

The frontends do neither right now, or at least the error handling is very inconsistent and inscrutable. It would be possible in theory to add a retry option, but that would be a very different patch, and given what we have learned about fsync(), it probably wouldn't be widely useful.
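
To make the intended frontend behavior concrete, here is a minimal standalone sketch of it. This is not the patch itself: the function name sketch_fsync_path() is hypothetical, and plain fprintf() stands in for the frontend logging calls. It assumes a POSIX system. An fsync() failure ends the process, except for the directory cases we deliberately ignore, while a failed open() is still reported through the return value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical illustration only; not the code in src/common/file_utils.c.
 *
 * Returns 0 on success or on an error we choose to ignore, -1 if the
 * path cannot be opened, and does not return if fsync() itself fails.
 */
int
sketch_fsync_path(const char *path, bool isdir)
{
	int			fd;
	int			flags;

	assert(path != NULL);

	/* Some systems refuse to fsync a regular file opened read-only. */
	flags = isdir ? O_RDONLY : O_RDWR;

	fd = open(path, flags);
	if (fd < 0)
	{
		/* Unreadable files, and directories that cannot be opened, are skipped. */
		if (errno == EACCES || (isdir && errno == EISDIR))
			return 0;
		fprintf(stderr, "could not open file \"%s\": %s\n",
				path, strerror(errno));
		return -1;
	}

	/* EBADF and EINVAL on a directory mean "directory fsync not supported". */
	if (fsync(fd) != 0 && !(isdir && (errno == EBADF || errno == EINVAL)))
	{
		fprintf(stderr, "could not fsync file \"%s\": %s\n",
				path, strerror(errno));
		(void) close(fd);
		exit(EXIT_FAILURE);		/* no retry: the data may already be lost */
	}

	(void) close(fd);
	return 0;
}
```

A caller would check the return value only for the open() case, for example `if (sketch_fsync_path("base", true) != 0) ...`. There is no code path that sees an fsync() failure and continues, which is the point of the change.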

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 91bd9bee1de18fbb1842d0223b693cff21eb2ec6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Thu, 20 Feb 2020 09:52:55 +0100
Subject: [PATCH v2] Change client-side fsync_fname() to report errors fatally

Given all we have learned about fsync() error handling in the last few
years, reporting an fsync() error non-fatally is not useful,
unless you don't care much about the file, in which case you probably
don't need to use fsync() in the first place.

Change fsync_fname() and durable_rename() to exit(1) on fsync() errors
other than those that we specifically chose to ignore.

This affects initdb, pg_basebackup, pg_checksums, pg_dump, pg_dumpall,
and pg_rewind.

Discussion: https://www.postgresql.org/message-id/flat/d239d1bd-aef0-ca7c-dc0a-da14bdcf0392%402ndquadrant.com
---
 src/common/file_utils.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index 413fe4eeb1..7584c1f2fb 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -51,8 +51,6 @@ static void walkdir(const char *path,
  * fsyncing, and might not have privileges to write at all.
  *
  * serverVersion indicates the version of the server to be fsync'd.
- *
- * Errors are reported but not considered fatal.
  */
 void
 fsync_pgdata(const char *pg_data,
@@ -250,8 +248,8 @@ pre_sync_fname(const char *fname, bool isdir)
  * fsync_fname -- Try to fsync a file or directory
  *
  * Ignores errors trying to open unreadable files, or trying to fsync
- * directories on systems where that isn't allowed/required.  Reports
- * other errors non-fatally.
+ * directories on systems where that isn't allowed/required.  All other errors
+ * are fatal.
  */
 int
 fsync_fname(const char *fname, bool isdir)
@@ -294,9 +292,9 @@ fsync_fname(const char *fname, bool isdir)
         */
        if (returncode != 0 && !(isdir && (errno == EBADF || errno == EINVAL)))
        {
-               pg_log_error("could not fsync file \"%s\": %m", fname);
+               pg_log_fatal("could not fsync file \"%s\": %m", fname);
                (void) close(fd);
-               return -1;
+               exit(EXIT_FAILURE);
        }
 
        (void) close(fd);
@@ -364,9 +362,9 @@ durable_rename(const char *oldfile, const char *newfile)
        {
                if (fsync(fd) != 0)
                {
-                       pg_log_error("could not fsync file \"%s\": %m", newfile);
+                       pg_log_fatal("could not fsync file \"%s\": %m", newfile);
                        close(fd);
-                       return -1;
+                       exit(EXIT_FAILURE);
                }
                close(fd);
        }
-- 
2.25.0
