On Fri, Jan 22, 2016 at 9:32 PM, Michael Paquier
<michael.paqu...@gmail.com> wrote:
> On Fri, Jan 22, 2016 at 5:26 PM, Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
>> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>>> Here are some comments about your patch after a look at the code.
>>>
>>> Regarding the additions in fsync_fname() in xlog.c:
>>> 1) In InstallXLogFileSegment, rename() will be called only if
>>> HAVE_WORKING_LINK is not used, which happens only on Windows and
>>> cygwin. We could add it for consistency, but it should be within the
>>> #else/#endif block. It is not critical as of now.
>>> 2) The call in RemoveXlogFile is not necessary, the rename happening
>>> only on Windows.
>>
>> Hmmm, OK. Are we sure HAVE_WORKING_LINK is false only on Windows, or could
>> there be some other platforms? And are we sure the file systems on those
>> platforms are safe without the fsync call?
>> That is, while the report references ext4, there may be other file systems
>> with the same problem - ext4 was used mostly as it's the most widely used
>> Linux file system.
>
> From pg_config_manual.h:
> #if !defined(WIN32) && !defined(__CYGWIN__)
> #define HAVE_WORKING_LINK 1
> #endif
> If we want to be consistent with what Posix proposes, I am not against
> adding it.

I did some tests with NTFS using cygwin, and the rename() calls remain
even after powering off the VM. But I agree that adding an fsync() in
both cases would be fine.

>>> Thoughts?
>>
>> Thanks for the review and comments. I think the question is whether we only
>> want to do the additional fsync() only when it ultimately may lead to data
>> loss, or even in cases where it may cause operational issues (e.g. switching
>> back to recovery needlessly).
>> I'd vote for the latter, as I think it makes the database easier to operate
>> (less manual interventions) and the performance impact is 0 (as those fsyncs
>> are really rare).
>
> My first line of thoughts after looking at the patch is that I am not
> against adding those fsync calls on HEAD as there is roughly an
> advantage to not go back to recovery in most cases and ensure
> consistent names, but as they do not imply any data loss I would not
> encourage a back-patch. Adding them seems harmless at first sight I
> agree, but those are not actual bugs.

OK. It is true that PGDATA would be fsync'd in 4 code paths with your
patch which are not that much taken:
- Renaming tablespace map file and backup label file (three times)
- Renaming to recovery.done
So, what do you think about the patch attached? Moving the calls into
the critical sections is not really necessary except when installing a
new segment.
-- 
Michael
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index f6da673..4173a50 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -435,6 +435,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
 						tmppath, path)));
+
+	/*
+	 * Make sure the rename is permanent by fsyncing the parent
+	 * directory.
+	 */
+	fsync_fname(XLOGDIR, true);
 #endif
 
 	/* The history file can be archived immediately. */
@@ -525,6 +531,9 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
 				(errcode_for_file_access(),
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
 						tmppath, path)));
+
+	/* Make sure the rename is permanent by fsyncing the directory. */
+	fsync_fname(XLOGDIR, true);
 #endif
 }
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a2846c4..b124f90 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3278,6 +3278,14 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 						tmppath, path)));
 		return false;
 	}
+
+	/*
+	 * Make sure the rename is permanent by fsyncing the parent
+	 * directory.
+	 */
+	START_CRIT_SECTION();
+	fsync_fname(XLOGDIR, true);
+	END_CRIT_SECTION();
 #endif
 
 	if (use_lock)
@@ -3800,10 +3808,18 @@ RemoveXlogFile(const char *segname, XLogRecPtr PriorRedoPtr, XLogRecPtr endptr)
 					  path)));
 			return;
 		}
+
+		/*
+		 * Make sure the rename is permanent by fsyncing the parent
+		 * directory.
+		 */
+		fsync_fname(XLOGDIR, true);
+
 		rc = unlink(newpath);
 #else
 		rc = unlink(path);
 #endif
+
 		if (rc != 0)
 		{
 			ereport(LOG,
@@ -5297,6 +5313,9 @@ exitArchiveRecovery(TimeLineID endTLI, XLogRecPtr endOfLog)
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
 						RECOVERY_COMMAND_FILE, RECOVERY_COMMAND_DONE)));
 
+	/* Make sure the rename is permanent by fsyncing the data directory. */
+	fsync_fname(".", true);
+
 	ereport(LOG,
 			(errmsg("archive recovery complete")));
 }
@@ -6150,6 +6169,12 @@ StartupXLOG(void)
 								TABLESPACE_MAP, BACKUP_LABEL_FILE),
 						 errdetail("Could not rename file \"%s\" to \"%s\": %m.",
 								   TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
+
+			/*
+			 * Make sure the rename is permanent by fsyncing the data
+			 * directory.
+			 */
+			fsync_fname(".", true);
 		}
 
 		/*
@@ -6525,6 +6550,13 @@ StartupXLOG(void)
 								TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
 		}
 
+		/*
+		 * Make sure the rename is permanent by fsyncing the parent
+		 * directory.
+		 */
+		if (haveBackupLabel || haveTblspcMap)
+			fsync_fname(".", true);
+
 		/* Check that the GUCs used to generate the WAL allow recovery */
 		CheckRequiredParameterValues();
 
@@ -7305,6 +7337,12 @@ StartupXLOG(void)
 						 errmsg("could not rename file \"%s\" to \"%s\": %m",
 								origpath, partialpath)));
 				XLogArchiveNotify(partialfname);
+
+				/*
+				 * Make sure the rename is permanent by fsyncing the parent
+				 * directory.
+				 */
+				fsync_fname(XLOGDIR, true);
 			}
 		}
 	}
@@ -10905,6 +10943,9 @@ CancelBackup(void)
 						   BACKUP_LABEL_FILE, BACKUP_LABEL_OLD,
 						   TABLESPACE_MAP, TABLESPACE_MAP_OLD)));
 	}
+
+	/* fsync the data directory to persist the renames */
+	fsync_fname(".", true);
 }
 
 /*
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 277c14a..8dda80b 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -477,6 +477,12 @@ KeepFileRestoredFromArchive(char *path, char *xlogfname)
 						path, xlogfpath)));
 
 	/*
+	 * Make sure the renames are permanent by fsyncing the parent
+	 * directory.
+	 */
+	fsync_fname(XLOGDIR, true);
+
+	/*
 	 * Create .done file forcibly to prevent the restored segment from being
 	 * archived again later.
 	 */
@@ -586,6 +592,11 @@ XLogArchiveForceDone(const char *xlog)
 					 errmsg("could not rename file \"%s\" to \"%s\": %m",
 							archiveReady, archiveDone)));
 
+		/*
+		 * Make sure the rename is permanent by fsyncing the parent
+		 * directory.
+		 */
+		fsync_fname(XLOGDIR "/archive_status", true);
 		return;
 	}
 
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 397f802..7165c74 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -733,4 +733,10 @@ pgarch_archiveDone(char *xlog)
 				(errcode_for_file_access(),
 				 errmsg("could not rename file \"%s\" to \"%s\": %m",
 						rlogready, rlogdone)));
+
+	/*
+	 * Make sure the rename is permanent by fsyncing the parent
+	 * directory.
+	 */
+	fsync_fname(XLOGDIR "/archive_status", true);
 }
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to