On Fri, 2 Feb 2024 11:18:18 +0100
Thomas Munro <thomas.mu...@gmail.com> wrote:

> Hi,
> 
> New WAL space is created by renaming a file into place.  Either a
> newly created file with a temporary name or, ideally, a recyclable old
> file with a name derived from an old LSN.  I think there is a data
> loss window between rename() and fsync(parent_directory).  A
> concurrent backend might open(new_name), write(), fdatasync(), and
> then we might lose power before the rename hits the disk.  The data
> itself would survive the crash, but recovery wouldn't be able to find
> and replay it.  That might break the log-before-data rule or forget a
> transaction that has been reported as committed to a client.
> 
> Actual breakage would presumably require really bad luck, and I
> haven't seen this happen or anything, it just occurred to me while
> reading code, and I can't see any existing defences.
> 
> One simple way to address that would be to make XLogFileInitInternal()
> wait for InstallXLogFileSegment() to finish.  It's a little

Or, can we make sure the rename is durable by calling fsync before
returning the fd, as a patch attached here?

Regards,
Yugo Nagata

> pessimistic to do that unconditionally, though, as then you have to
> wait even for rename operations for segment files later than the one
> you're opening, so I thought about how to skip waiting in that case --
> see 0002.  I'm not sure if it's worth worrying about or not.


-- 
Yugo NAGATA <nag...@sraoss.co.jp>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 478377c4a2..862109ca3b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3062,7 +3062,11 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli,
 					 errmsg("could not open file \"%s\": %m", path)));
 	}
 	else
+	{
+		fsync_fname_ext(path, false, false, ERROR);
+		fsync_parent_path(path, ERROR);
 		return fd;
+	}
 
 	/*
 	 * Initialize an empty (all zeroes) segment.  NOTE: it is possible that

Reply via email to