On Fri, 10 Jan 2025 at 13:42, Yura Sokolov <y.soko...@postgrespro.ru> wrote:
>
> BTW, your version could make alike trick for guaranteed atomicity:
> - change XLogRecord's `XLogRecPtr xl_prev` to `uint32 xl_prev_offset`
>   and store offset to prev record's start.
-1, I don't think that is possible without degrading what our current
WAL system protects against.

For intra-record torn write protection we have the checksum, but that
same protection doesn't extend across the multiple WAL records on each
page. That is what the xl_prev pointer is used for: detecting that this
part of the page doesn't contain the correct data (e.g. it still holds
the data of a previous version of this recycled segment). If we replaced
xl_prev with just an offset into the segment, then this protection would
be much less effective, as the previous version of the segment
realistically had its records at the same offsets into the file.

To protect against torn writes while still only using segment-relative
record offsets, you'd have to zero and then fsync any segment before
reusing it, which would severely reduce the benefits we get from
recycling segments.

Note that we can't expect the page header to help here, as write tears
can happen at nearly any offset into the page, not just at 8k intervals,
and so the page header is not always representative of the origins of
all bytes on the page, only of the first 24 (if even that).

Kind regards,

Matthias van de Meent
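To make the recycled-segment case above concrete, here is a minimal sketch in C. It is not PostgreSQL code: the helper names (lsn_at, prev_matches_absolute, prev_matches_offset) are hypothetical, and it assumes the default 16MB segment size and a stale record that happens to sit at the same file offset as in the previous life of the segment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default 16MB WAL segment size. */
#define WAL_SEG_SIZE (UINT64_C(16) * 1024 * 1024)

/* 64-bit WAL position, as in PostgreSQL; everything below is illustrative. */
typedef uint64_t XLogRecPtr;

/* Hypothetical helper: absolute WAL position of byte `off` in segment `segno`. */
static XLogRecPtr
lsn_at(uint64_t segno, uint32_t off)
{
    return segno * WAL_SEG_SIZE + off;
}

/*
 * Current scheme: the record stores the absolute position of the previous
 * record, so a leftover record from an older use of the file points into
 * a different segment number and fails the comparison.
 */
static bool
prev_matches_absolute(XLogRecPtr stored_prev, XLogRecPtr expected_prev)
{
    return stored_prev == expected_prev;
}

/*
 * Proposed scheme: the record stores only the offset of the previous record
 * within its segment. The segment number is no longer part of the check.
 */
static bool
prev_matches_offset(uint32_t stored_prev_off, XLogRecPtr expected_prev)
{
    return stored_prev_off == (uint32_t) (expected_prev % WAL_SEG_SIZE);
}
```

As an example with made-up numbers: the file was segment 5 with records at offsets 1000 and 1200, and is recycled as segment 9. The new record at offset 1000 reaches disk, but the bytes at 1200 are still the old record. That old record stored lsn_at(5, 1000) as its previous pointer, while the reader expects lsn_at(9, 1000), so prev_matches_absolute() rejects it. With only the offset stored, the stale value is 1000, which is exactly what the reader expects, so prev_matches_offset() accepts it. The stale record is an intact old record, so its own checksum would not flag it either.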