On Mon, Sep 02, 2019 at 05:38:56PM +0900, Michael Paquier wrote: > Thinking wider, don't we have the same problem with wal_sender_timeout > in the case where a sync request takes longer than the time it would > take the backend to terminate the connection?
I have been able to work more on that, and that can indeed happen with wal_sender_timeout. While reviewing the code, I have noticed that there is little point to enable do_sync when fetching WAL segments. This actually led to too many fsyncs done for the plain format as each WAL segment is fsync'd first by walmethods.c, then fsync'd again by fsync_pgdata() in pg_wal/. Attached is an updated patch, which needs to go down to v10. -- Michael
From ec75a75514d9601d1b1c3b929ac1eb870a208aa2 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Tue, 3 Sep 2019 10:48:45 +0900 Subject: [PATCH v2] Delay fsync of pg_basebackup tar format until the end Since the addition of fsync requests in bc34223 to make base backup data consistent on disk once pg_basebackup finishes, each tablespace tar file is individually flushed once completed, with an additional flush of the parent directory when the base backup finishes. While holding a connection to the server, a fsync request taking a long time may cause a failure of the base backup. A recent example is tcp_user_timeout, but wal_sender_timeout can cause similar problems. While reviewing the code, there was a second issue causing too many fsync requests to be done when completing WAL records for the tar or plain format. As recursive fsyncs are done at the end for both formats for everything written, it is fine to disable fsync when fetching or streaming WAL files. Reported-by: Ryohei Takahashi Author: Michael Paquier Reviewed-by: Ryohei Takahashi Discussion: https://postgr.es/m/osbpr01mb4550dae2f8c9502894a45aab82...@osbpr01mb4550.jpnprd01.prod.outlook.com Backpatch-through: 10 --- src/bin/pg_basebackup/pg_basebackup.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 9207109ba3..da475319b6 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -486,15 +486,18 @@ LogStreamerMain(logstreamer_param *param) #endif stream.standby_message_timeout = standby_message_timeout; stream.synchronous = false; - stream.do_sync = do_sync; + /* fsync happens at the end of pg_basebackup for all data */ + stream.do_sync = false; stream.mark_done = true; stream.partial_suffix = NULL; stream.replication_slot = replication_slot; if (format == 'p') - stream.walmethod = CreateWalDirectoryMethod(param->xlog, 0, do_sync); + stream.walmethod = CreateWalDirectoryMethod(param->xlog, 0, + stream.do_sync); else - stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel, do_sync); + stream.walmethod = CreateWalTarMethod(param->xlog, compresslevel, + stream.do_sync); if (!ReceiveXlogStream(param->bgconn, &stream)) @@ -1346,9 +1349,10 @@ ReceiveTarFile(PGconn *conn, PGresult *res, int rownum) if (copybuf != NULL) PQfreemem(copybuf); - /* sync the resulting tar file, errors are not considered fatal */ - if (do_sync && strcmp(basedir, "-") != 0) - (void) fsync_fname(filename, false); + /* + * Do not sync the resulting tar file yet, all files are synced once at + * the end. + */ } @@ -2138,9 +2142,9 @@ BaseBackup(void) /* * Make data persistent on disk once backup is completed. For tar format - * once syncing the parent directory is fine, each tar file created per - * tablespace has been already synced. In plain format, all the data of - * the base directory is synced, taking into account all the tablespaces. + * sync the parent directory and all its contents as each tar file was not + * synced after being completed. In plain format, all the data of the + * base directory is synced, taking into account all the tablespaces. * Errors are not considered fatal. */ if (do_sync) @@ -2150,7 +2154,7 @@ BaseBackup(void) if (format == 't') { if (strcmp(basedir, "-") != 0) - (void) fsync_fname(basedir, true); + (void) fsync_dir_recurse(basedir); } else { -- 2.23.0
signature.asc
Description: PGP signature