On Tue, Feb 2, 2016 at 1:07 AM, Andres Freund <and...@anarazel.de> wrote: > On 2016-01-25 16:30:47 +0900, Michael Paquier wrote: >> diff --git a/src/backend/access/transam/xlog.c >> b/src/backend/access/transam/xlog.c >> index a2846c4..b124f90 100644 >> --- a/src/backend/access/transam/xlog.c >> +++ b/src/backend/access/transam/xlog.c >> @@ -3278,6 +3278,14 @@ InstallXLogFileSegment(XLogSegNo *segno, char >> *tmppath, >> tmppath, path))); >> return false; >> } >> + >> + /* >> + * Make sure the rename is permanent by fsyncing the parent >> + * directory. >> + */ >> + START_CRIT_SECTION(); >> + fsync_fname(XLOGDIR, true); >> + END_CRIT_SECTION(); >> #endif > > Hm. I'm seriously doubtful that using critical sections for this is a > good idea. What's the scenario you're protecting against by declaring > this one? We intentionally don't error out in the isdir cases in > fsync_fname() cases anyway. > > Afaik we need to fsync tmppath before and after the rename, and only > then the directory, to actually be safe.
Regarding the fsync call on the new file before the rename, would it be better to extend fsync_fname() with some kind of noerror flag to work around the case of a file that does not exist or do you think it is better just to use pg_fsync in this case after getting an fd? Using directly pg_fsync() looks redundant with what fsync_fname does already. >> @@ -5297,6 +5313,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr >> endOfLog) >> errmsg("could not rename file \"%s\" to >> \"%s\": %m", >> RECOVERY_COMMAND_FILE, >> RECOVERY_COMMAND_DONE))); >> >> + /* Make sure the rename is permanent by fsyncing the data directory. */ >> + fsync_fname(".", true); >> + > > Shouldn't RECOVERY_COMMAND_DONE be fsynced first here? Check. >> /* >> @@ -6525,6 +6550,13 @@ StartupXLOG(void) >> >> TABLESPACE_MAP, TABLESPACE_MAP_OLD))); >> } >> >> + /* >> + * Make sure the rename is permanent by fsyncing the parent >> + * directory. >> + */ >> + if (haveBackupLabel || haveTblspcMap) >> + fsync_fname(".", true); >> + > > Isn't that redundant with the haveTblspcMap case above? I am not sure I get your point here. Are you referring to the fact that fsync should be done after each rename in this case? >> /* Check that the GUCs used to generate the WAL allow recovery >> */ >> CheckRequiredParameterValues(); >> >> @@ -7305,6 +7337,12 @@ StartupXLOG(void) >> errmsg("could not rename file >> \"%s\" to \"%s\": %m", >> origpath, >> partialpath))); >> XLogArchiveNotify(partialfname); >> + >> + /* >> + * Make sure the rename is permanent by >> fsyncing the parent >> + * directory. >> + */ >> + fsync_fname(XLOGDIR, true); > > .partial should be fsynced first. Check. >> } >> } >> } >> @@ -10905,6 +10943,9 @@ CancelBackup(void) >> BACKUP_LABEL_FILE, >> BACKUP_LABEL_OLD, >> TABLESPACE_MAP, >> TABLESPACE_MAP_OLD))); >> } >> + >> + /* fsync the data directory to persist the renames */ >> + fsync_fname(".", true); >> } > > Same. Re-check. >> /* >> diff --git a/src/backend/access/transam/xlogarchive.c >> b/src/backend/access/transam/xlogarchive.c >> index 277c14a..8dda80b 100644 >> --- a/src/backend/access/transam/xlogarchive.c >> +++ b/src/backend/access/transam/xlogarchive.c >> @@ -477,6 +477,12 @@ KeepFileRestoredFromArchive(char *path, char *xlogfname) >> path, xlogfpath))); >> >> /* >> + * Make sure the renames are permanent by fsyncing the parent >> + * directory. >> + */ >> + fsync_fname(XLOGDIR, true); > > Afaics the file under the temporary filename has not been fsynced up to > here. Yes, true, the old file... >> + /* >> * Create .done file forcibly to prevent the restored segment from >> being >> * archived again later. >> */ >> @@ -586,6 +592,11 @@ XLogArchiveForceDone(const char *xlog) >> errmsg("could not rename file \"%s\" >> to \"%s\": %m", >> archiveReady, >> archiveDone))); >> >> + /* >> + * Make sure the rename is permanent by fsyncing the parent >> + * directory. >> + */ >> + fsync_fname(XLOGDIR "/archive_status", true); >> return; >> } > > Afaics the AllocateFile() call below is not protected at all, no? Yep. > How about introducing a 'safe_rename()' that does something roughly akin > to: > fsync(oldname); > fsync(fname) || true; > rename(oldfname, fname); > fsync(fname); > fsync(basename(fname)); > > I'd rather have that kind of logic somewhere once, instead of repeated a > dozen times... Not wrong, and this leads to the following: void rename_safe(const char *old, const char *new, bool isdir, int elevel); Controlling elevel is necessary per the multiple code paths that would use it. Some use ERROR, most of them FATAL, and a bit of WARNING. Does that look fine? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers