Attached are two patches, which together fix two historical buglets around VACUUM's approach to setting bits in the visibility map. (Whether this is actually refactoring work or hardening work is debatable, I suppose.)
The first patch makes sure that the snapshotConflictHorizon cutoff (the XID cutoff for recovery conflicts) is never a special XID, unless that XID is InvalidTransactionId, which is interpreted as a snapshotConflictHorizon value that will never require a recovery conflict (per the general convention for snapshotConflictHorizon values explained above ResolveRecoveryConflictWithSnapshot). In other words, it establishes a hard rule: a snapshotConflictHorizon value is either InvalidTransactionId or a normal XID. An assertion enforces the rule in REDO routines, at the point where they call ResolveRecoveryConflictWithSnapshot with the WAL record's snapshotConflictHorizon XID value.

The second patch makes sure that VACUUM can never set a page all-frozen in the visibility map without also setting the same page all-visible in the same call to visibilitymap_set() -- regardless of what we think we know about the current state of the all-visible bit in the VM. It adjusts one of the visibilitymap_set() calls in vacuumlazy.c that could previously set a page's all-frozen bit without also setting its all-visible bit, which allowed VACUUM to leave a page all-frozen but not all-visible in the visibility map (since the value of all_visible_according_to_vm can go stale). I think this should be treated as a basic visibility map invariant: an all-frozen page must also be all-visible, by definition, so why should it even be physically possible for the VM to give a contradictory picture of the all_visible/all_frozen status of any one page? Assertions are added that more or less make this rule into an invariant. amcheck/pg_visibility coverage might make sense too, but I haven't done that here.

The second patch also adjusts a later visibilitymap_set() call site (the one used just after heap vacuuming runs in the final heap pass) in roughly the same way: it no longer reads from the visibility map to see what bits need to be changed.
The existing approach here seems rather odd. The whole point of calling lazy_vacuum_heap_page() is to set LP_DEAD items referenced by VACUUM's dead_items array to LP_UNUSED -- there must have been at least one LP_DEAD item on the page for us to end up here (which a Postgres 14 era assertion verifies for us). So we already know perfectly well that the visibility map shouldn't indicate that the page is all-visible yet -- why bother asking the VM? Besides, any call to visibilitymap_set() will only modify the VM when it directly observes that the bits have changed -- so why attempt to duplicate that on the caller side?

It seems to me that the visibilitymap_get_status() call inside lazy_vacuum_heap_page() is really being abused as a substitute for visibilitymap_pin(). Why not use top-level visibilitymap_pin() calls instead, just as we do in the first heap pass? That's how the second patch does it: it adds a visibilitymap_pin() call to lazy_vacuum_heap_rel()'s blkno-wise loop. That gives us parity between the first and second heap pass, which seems like a clear maintainability win -- every caller can pass the already-pinned/already-set-up vmbuffer by value.

-- 
Peter Geoghegan
v1-0001-Avoid-special-XIDs-in-snapshotConflictHorizon-val.patch
Description: Binary data
v1-0002-Never-just-set-the-all-frozen-bit-in-VM.patch
Description: Binary data