gistGetFakeLSN() can return incorrect LSNs

Andres Freund Thu, 05 Mar 2026 09:10:33 -0800

Hi,

Tomas encountered a crash with the index prefetching patchset. One of the
patches included therein is a generalization of the gistGetFakeLSN()
mechanism, which is then used by other indexes as well.  That triggered an
occasional, hard to locally reproduce, ERROR or PANIC in CI, about


  ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000


A bunch of debugging later, it turns out that this is a pre-existing issue.

XLogRecPtr
gistGetFakeLSN(Relation rel)
{
...
    else if (RelationIsPermanent(rel))
    {
        /*
         * WAL-logging on this relation will start after commit, so its LSNs
         * must be distinct numbers smaller than the LSN at the next commit.
         * Emit a dummy WAL record if insert-LSN hasn't advanced after the
         * last call.
         */
        static XLogRecPtr lastlsn = InvalidXLogRecPtr;
        XLogRecPtr  currlsn = GetXLogInsertRecPtr();

        /* Shouldn't be called for WAL-logging relations */
        Assert(!RelationNeedsWAL(rel));

        /* No need for an actual record if we already have a distinct LSN */
        if (XLogRecPtrIsValid(lastlsn) && lastlsn == currlsn)
            currlsn = gistXLogAssignLSN();
        lastlsn = currlsn;
        XLogFlush(currlsn);

        return currlsn;
    }


The problem is that GetXLogInsertRecPtr() returns the start of the next
record. That's *most* of the time the same as what XLogInsert() would return
(i.e. one byte past the end of the last record), but not reliably so: if the
last record ends exactly at a page boundary, XLogInsert() returns an LSN
pointing at the start of the next page, but GetXLogInsertRecPtr() returns an
LSN that points to just after the xlog page header.

If you look at the error from above, that's exactly what's happening - the
flush is only up to 0/01BD2000 (0x2000 is 8192, i.e. an 8kB boundary). But the
flush request is to 0/01BD2018, where 0x18 is the size of XLogPageHeaderData.

To be safe, this code would need to use a version of GetXLogInsertRecPtr()
that does use XLogBytePosToEndRecPtr() instead of XLogBytePosToRecPtr().
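
To illustrate the difference between the two conversions, here is a minimal
standalone sketch. It is not the xlog.c code: the model_* names are made up
for this example, and it assumes every WAL page is 8192 bytes with a 24 byte
header, ignoring the longer header on the first page of each segment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (assumption): every WAL page is 8192 bytes and begins
 * with a 24 byte page header. The long header on the first page of a
 * segment is deliberately ignored.
 */
#define MODEL_BLCKSZ    8192u
#define MODEL_PAGE_HDR  24u
#define MODEL_USABLE    (MODEL_BLCKSZ - MODEL_PAGE_HDR)

/*
 * "Start of next record" conversion: a position at a page boundary is
 * reported as pointing just past the page header of the following page.
 */
static uint64_t
model_bytepos_to_start_lsn(uint64_t bytepos)
{
    uint64_t    fullpages = bytepos / MODEL_USABLE;
    uint64_t    bytesleft = bytepos % MODEL_USABLE;

    return fullpages * MODEL_BLCKSZ + MODEL_PAGE_HDR + bytesleft;
}

/*
 * "End of previous record" conversion: a position at a page boundary is
 * reported as the page boundary itself, without skipping the header.
 */
static uint64_t
model_bytepos_to_end_lsn(uint64_t bytepos)
{
    uint64_t    fullpages = bytepos / MODEL_USABLE;
    uint64_t    bytesleft = bytepos % MODEL_USABLE;

    if (bytesleft == 0)
        return fullpages * MODEL_BLCKSZ;
    return fullpages * MODEL_BLCKSZ + MODEL_PAGE_HDR + bytesleft;
}

/* Gap between the two answers: 0 mid-page, header size at a boundary. */
static uint64_t
model_lsn_gap(uint64_t bytepos)
{
    uint64_t    start = model_bytepos_to_start_lsn(bytepos);
    uint64_t    end = model_bytepos_to_end_lsn(bytepos);

    assert(start >= end);
    return start - end;
}
```

In this model, a byte position of exactly one page worth of usable bytes
(8168) maps to 0x2018 with the "start" conversion and to 0x2000 with the
"end" conversion, which is the same 0x18 gap as in the error above. Anywhere
in the middle of a page the two conversions agree.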


It's probably not easy to trigger this outside of aggressive test scenarios,
due to needing to avoid the gistXLogAssignLSN() path and having to encounter a
"reason" to flush the buffer immediately after.


However, if I put an XLogFlush() into gistGetFakeLSN() and use
wal_level=minimal, it's a lot easier to reproduce.


It looks like this was introduced in

commit c6b92041d38
Author: Noah Misch <[email protected]>
Date:   2020-04-04 12:25:34 -0700

    Skip WAL for new relfilenodes, under wal_level=minimal.


Greetings,

Andres Freund
