On Thu, Aug 10, 2023 at 9:15 PM Christoph Berg <m...@debian.org> wrote:
> No XXX lines this time either, but I've seen them in logfiles that
> went through successfully.

Hmm. Well, I think this looks like a different kind of bug then. That patch of mine is about fixing some unsafe coding on the receiving side of a signal. In this case it's apparently not being sent. So either the Heap2/PRUNE record was able to proceed (indicating that that CURSOR was not holding a pin as expected), or VACUUM decided not to actually do anything to that block (conditional cleanup lock vs transient pin changing behaviour?), or there's a bug somewhere in/near LockBufferForCleanup(), which should have emitted that XXX message before even calling ResolveRecoveryConflictWithBufferPin().

Do you still have the data directories around from that run, so we can see if the expected Heap2/PRUNE was actually logged? For example (using meson layout here, in the build directory) that'd be something like:

$ ./tmp_install/home/tmunro/install/bin/pg_waldump testrun/recovery/031_recovery_conflict/data/t_031_recovery_conflict_standby_data/pgdata/pg_wal/000000010000000000000003

In there I see this:

rmgr: Heap2 len (rec/tot): 57/ 57, tx: 0, lsn: 0/0344BB90, prev 0/0344BB68, desc: PRUNE snapshotConflictHorizon: 0, nredirected: 0, ndead: 1, nunused: 0, redirected: [], dead: [21], unused: [], blkref #0: rel 1663/16385/16386 blk 0

That's the WAL record that's supposed to be causing 031_recovery_conflict_standby.log to talk about a conflict, starting with this:

2023-08-10 22:47:04.564 NZST [57145] LOG: recovery still waiting after 10.035 ms: recovery conflict on buffer pin
2023-08-10 22:47:04.564 NZST [57145] CONTEXT: WAL redo at 0/344BB90 for Heap2/PRUNE: snapshotConflictHorizon: 0, nredirected: 0, ndead: 1, nunused: 0, redirected: [], dead: [21], unused: []; blkref #0: rel 1663/16385/16386, blk 0
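
If eyeballing the full pg_waldump output is tedious, here is a rough sketch of a filter that picks out Heap2/PRUNE records touching a given block. It assumes the text line format shown above (rmgr, lsn, desc, blkref fields); the function name find_prune_records and the regular expressions are my own illustration, not part of PostgreSQL, and other versions may format these lines differently.

```python
import re

# Assumed line shape, based on the pg_waldump output quoted above:
#   rmgr: Heap2 ... lsn: 0/0344BB90, ... desc: PRUNE ..., blkref #0: rel 1663/16385/16386 blk 0
RECORD_RE = re.compile(
    r"rmgr:\s*(?P<rmgr>\S+)\s+.*?lsn:\s*(?P<lsn>[0-9A-Fa-f]+/[0-9A-Fa-f]+),"
    r".*?desc:\s*(?P<op>\S+)"
)
# Tolerates an optional "fork <name>" and an optional comma before "blk".
BLKREF_RE = re.compile(
    r"blkref #\d+: rel (?P<rel>\d+/\d+/\d+)(?: fork \w+)?,? blk (?P<blk>\d+)"
)


def find_prune_records(lines, rel=None, blk=None):
    """Return (lsn, [(rel, blk), ...]) for each Heap2 PRUNE record.

    rel is a 'spcOid/dbOid/relNumber' string and blk a block number;
    either filter may be left as None to match anything.
    """
    hits = []
    for line in lines:
        m = RECORD_RE.search(line)
        if not m or m.group("rmgr") != "Heap2":
            continue
        # startswith() so that longer operation names beginning with
        # PRUNE would also match (an assumption, not verified here).
        if not m.group("op").startswith("PRUNE"):
            continue
        refs = [(b.group("rel"), int(b.group("blk")))
                for b in BLKREF_RE.finditer(line)]
        wanted = [
            (r, b) for r, b in refs
            if (rel is None or r == rel) and (blk is None or b == blk)
        ]
        if wanted:
            hits.append((m.group("lsn"), refs))
    return hits
```

You'd feed it the lines of pg_waldump's output (for instance from a pipe via sys.stdin), with rel="1663/16385/16386" and blk=0 for the record above. An empty result would suggest the expected PRUNE never made it into that segment. Running pg_waldump with --rmgr=Heap2 first should cut down the amount of text to scan.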