On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier <michael.paqu...@gmail.com> wrote: > On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: >> On 01/22/2016 06:45 AM, Michael Paquier wrote: >>> Here are some comments about your patch after a look at the code. >>> >>> Regarding the additions in fsync_fname() in xlog.c: >>> 1) In InstallXLogFileSegment, rename() will be called only if >>> HAVE_WORKING_LINK is not used, which happens only on Windows and >>> cygwin. We could add it for consistency, but it should be within the >>> #else/#endif block. It is not critical as of now. >>> 2) The call in RemoveXlogFile is not necessary, the rename happening >>> only on Windows. >> >> Hmmm, OK. Are we sure HAVE_WORKING_LINK is false only on Windows, or could >> there be some other platforms? And are we sure the file systems on those >> platforms are safe without the fsync call? >> That is, while the report references ext4, there may be other file systems >> with the same problem - ext4 was used mostly as it's the most widely used >> Linux file system. > > From pg_config_manual.h: > #if !defined(WIN32) && !defined(__CYGWIN__) > #define HAVE_WORKING_LINK 1 > #endif > If we want to be consistent with what Posix proposes, I am not against > adding it.
I did some tests with NTFS using cygwin, and the rename() calls remain even after powering off the VM. But I agree that adding an fsync() in both cases would be fine. >>> Thoughts? >> >> Thanks for the review and comments. I think the question is whether we only >> want to do the additional fsync() only when it ultimately may lead to data >> loss, or even in cases where it may cause operational issues (e.g. switching >> back to recovery needlessly). >> I'd vote for the latter, as I think it makes the database easier to operate >> (less manual interventions) and the performance impact is 0 (as those fsyncs >> are really rare). > > My first line of thoughts after looking at the patch is that I am not > against adding those fsync calls on HEAD as there is roughly an > advantage to not go back to recovery in most cases and ensure > consistent names, but as they do not imply any data loss I would not > encourage a back-patch. Adding them seems harmless at first sight I > agree, but those are not actual bugs. OK. It is true that PGDATA would be fsync'd in 4 code paths with your patch which are not that much taken: - Renaming tablespace map file and backup label file (three times) - Renaming to recovery.done So, what do you think about the patch attached? Moving the calls into the critical sections is not really necessary except when installing a new segment. -- Michael
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c index f6da673..4173a50 100644 --- a/src/backend/access/transam/timeline.c +++ b/src/backend/access/transam/timeline.c @@ -435,6 +435,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, (errcode_for_file_access(), errmsg("could not rename file \"%s\" to \"%s\": %m", tmppath, path))); + + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR, true); #endif /* The history file can be archived immediately. */ @@ -525,6 +531,9 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size) (errcode_for_file_access(), errmsg("could not rename file \"%s\" to \"%s\": %m", tmppath, path))); + + /* Make sure the rename is permanent by fsyncing the directory. */ + fsync_fname(XLOGDIR, true); #endif } diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index a2846c4..b124f90 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3278,6 +3278,14 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath, tmppath, path))); return false; } + + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + START_CRIT_SECTION(); + fsync_fname(XLOGDIR, true); + END_CRIT_SECTION(); #endif if (use_lock) @@ -3800,10 +3808,18 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr) path))); return; } + + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR, true); + rc = unlink(newpath); #else rc = unlink(path); #endif + if (rc != 0) { ereport(LOG, @@ -5297,6 +5313,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog) errmsg("could not rename file \"%s\" to \"%s\": %m", RECOVERY_COMMAND_FILE, RECOVERY_COMMAND_DONE))); + /* Make sure the rename is permanent by fsyncing the data directory. */ + fsync_fname(".", true); + ereport(LOG, (errmsg("archive recovery complete"))); } @@ -6150,6 +6169,12 @@ StartupXLOG(void) TABLESPACE_MAP, BACKUP_LABEL_FILE), errdetail("Could not rename file \"%s\" to \"%s\": %m.", TABLESPACE_MAP, TABLESPACE_MAP_OLD))); + + /* + * Make sure the rename is permanent by fsyncing the data + * directory. + */ + fsync_fname(".", true); } /* @@ -6525,6 +6550,13 @@ StartupXLOG(void) TABLESPACE_MAP, TABLESPACE_MAP_OLD))); } + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + if (haveBackupLabel || haveTblspcMap) + fsync_fname(".", true); + /* Check that the GUCs used to generate the WAL allow recovery */ CheckRequiredParameterValues(); @@ -7305,6 +7337,12 @@ StartupXLOG(void) errmsg("could not rename file \"%s\" to \"%s\": %m", origpath, partialpath))); XLogArchiveNotify(partialfname); + + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR, true); } } } @@ -10905,6 +10943,9 @@ CancelBackup(void) BACKUP_LABEL_FILE, BACKUP_LABEL_OLD, TABLESPACE_MAP, TABLESPACE_MAP_OLD))); } + + /* fsync the data directory to persist the renames */ + fsync_fname(".", true); } /* diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c index 277c14a..8dda80b 100644 --- a/src/backend/access/transam/xlogarchive.c +++ b/src/backend/access/transam/xlogarchive.c @@ -477,6 +477,12 @@ KeepFileRestoredFromArchive(char *path, char *xlogfname) path, xlogfpath))); /* + * Make sure the renames are permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR, true); + + /* * Create .done file forcibly to prevent the restored segment from being * archived again later. */ @@ -586,6 +592,11 @@ XLogArchiveForceDone(const char *xlog) errmsg("could not rename file \"%s\" to \"%s\": %m", archiveReady, archiveDone))); + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR "/archive_status", true); return; } diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c index 397f802..7165c74 100644 --- a/src/backend/postmaster/pgarch.c +++ b/src/backend/postmaster/pgarch.c @@ -733,4 +733,10 @@ pgarch_archiveDone(char *xlog) (errcode_for_file_access(), errmsg("could not rename file \"%s\" to \"%s\": %m", rlogready, rlogdone))); + + /* + * Make sure the rename is permanent by fsyncing the parent + * directory. + */ + fsync_fname(XLOGDIR "/archive_status", true); }
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers