On Fri, 2 Feb 2024 11:18:18 +0100 Thomas Munro <thomas.mu...@gmail.com> wrote:
> Hi, > > New WAL space is created by renaming a file into place. Either a > newly created file with a temporary name or, ideally, a recyclable old > file with a name derived from an old LSN. I think there is a data > loss window between rename() and fsync(parent_directory). A > concurrent backend might open(new_name), write(), fdatasync(), and > then we might lose power before the rename hits the disk. The data > itself would survive the crash, but recovery wouldn't be able to find > and replay it. That might break the log-before-data rule or forget a > transaction that has been reported as committed to a client. > > Actual breakage would presumably require really bad luck, and I > haven't seen this happen or anything, it just occurred to me while > reading code, and I can't see any existing defences. > > One simple way to address that would be to make XLogFileInitInternal() > wait for InstallXLogFileSegment() to finish. It's a little Or, can we make sure the rename is durable by calling fsync before returning the fd, as a patch attached here? Regards, Yugo Nagata > pessimistic to do that unconditionally, though, as then you have to > wait even for rename operations for segment files later than the one > you're opening, so I thought about how to skip waiting in that case -- > see 0002. I'm not sure if it's worth worrying about or not. -- Yugo NAGATA <nag...@sraoss.co.jp>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 478377c4a2..862109ca3b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3062,7 +3062,11 @@ XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli, errmsg("could not open file \"%s\": %m", path))); } else + { + fsync_fname_ext(path, false, false, ERROR); + fsync_parent_path(path, ERROR); return fd; + } /* * Initialize an empty (all zeroes) segment. NOTE: it is possible that