Hi,

On 2022-06-21 17:22:05 +1200, Thomas Munro wrote:
> Problem: I saw 031_recovery_conflict.pl time out while waiting for a
> buffer pin conflict, but so far once only, on CI:
>
> https://cirrus-ci.com/task/5956804860444672
>
> timed out waiting for match: (?^:User was holding shared buffer pin
> for too long) at t/031_recovery_conflict.pl line 367.
>
> Hrmph. Still trying to reproduce that, which may be a bug in this
> patch, a bug in the test or a pre-existing problem. Note that
> recovery didn't say something like:
>
> 2022-06-21 17:05:40.931 NZST [57674] LOG: recovery still waiting
> after 11.197 ms: recovery conflict on buffer pin
>
> (That's what I'd expect to see in
> https://api.cirrus-ci.com/v1/artifact/task/5956804860444672/log/src/test/recovery/tmp_check/log/031_recovery_conflict_standby.log
> if the startup process had decided to send the signal).
>
> ... so it seems like the problem in that run is upstream of the interrupt
> stuff.

Odd. The only theory I have so far is that the manual VACUUM on the primary
somehow decided to skip the page, and thus didn't trigger a conflict, because
replay clearly progressed past the records of the VACUUM.

Perhaps we should use VACUUM VERBOSE? In contrast to pg_regress tests, that
should be unproblematic here.

Greetings,

Andres Freund