Hi,

New WAL space is created by renaming a file into place: either a newly created file with a temporary name or, ideally, a recyclable old file with a name derived from an old LSN. I think there is a data loss window between rename() and fsync(parent_directory). A concurrent backend might open(new_name), write(), fdatasync(), and then we might lose power before the rename hits the disk. The data itself would survive the crash, but recovery wouldn't be able to find and replay it, because the file would not be under its new name. That might break the log-before-data rule or forget a transaction that has been reported as committed to a client.
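To make the window concrete, here is a minimal POSIX sketch of the rename-then-fsync sequence. It is not PostgreSQL code: rename_durably() and its parameters are hypothetical names used only for illustration, and the sketch assumes a platform where a directory can be opened read-only and passed to fsync().

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: rename oldpath to newpath, then fsync the directory
 * that contains newpath so that the new directory entry survives a crash.
 * Returns 0 on success, -1 on failure.
 */
int
rename_durably(const char *oldpath, const char *newpath, const char *dirpath)
{
    int dirfd;

    if (rename(oldpath, newpath) != 0)
        return -1;

    /*
     * The window: from here until fsync() below returns, another process
     * can already open(newpath), write() and fdatasync() its data, yet the
     * name "newpath" itself is not guaranteed to be on disk.
     */

    dirfd = open(dirpath, O_RDONLY);
    if (dirfd < 0)
        return -1;

    if (fsync(dirfd) != 0)
    {
        close(dirfd);
        return -1;
    }

    return close(dirfd);
}
```

Anything that makes the file contents durable between the rename() and the directory fsync() is relying on a name that could still vanish after power loss.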
Actual breakage would presumably require really bad luck, and I haven't seen this happen or anything. It just occurred to me while reading code, and I can't see any existing defences.

One simple way to address that would be to make XLogFileInitInternal() wait for InstallXLogFileSegment() to finish. It's a little pessimistic to do that unconditionally, though, as then you have to wait even for rename operations for segment files later than the one you're opening, so I thought about how to skip waiting in that case -- see 0002. I'm not sure if it's worth worrying about or not.
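As a rough illustration of the skip-waiting idea, and not the code in 0002, the decision could be modelled as below. InstallState, durable_end_segno, install_in_progress and opener_must_wait() are all hypothetical names, and the locking a real shared-memory structure would need is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of shared state describing installed WAL space. */
typedef struct InstallState
{
    /* Segments numbered below this have had their names made durable. */
    uint64_t    durable_end_segno;
    /* True while a rename + directory fsync is still in flight. */
    bool        install_in_progress;
} InstallState;

/*
 * Decide whether a backend about to open segment 'segno' has to wait.
 * If the segment's name is already known to be durable, an in-flight
 * install can only concern a later segment, so there is nothing to wait
 * for.  Otherwise, wait for any in-flight install to finish.
 */
bool
opener_must_wait(const InstallState *st, uint64_t segno)
{
    if (segno < st->durable_end_segno)
        return false;

    return st->install_in_progress;
}
```

With durable_end_segno = 10 and an install in flight, opening segment 7 would proceed immediately, while opening segment 10 would wait.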
0001-Fix-InstallXLogFileSegment-concurrency-bug.patch
0002-Track-end-of-installed-WAL-space-in-shared-memory.patch