On Mon, Apr 12, 2021 at 9:19 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> So I think we have to stick with the current basic design, and just
> tighten things up to make sure that visibility pins are accounted for
> in the places that are missing it.
>
> Hence, I propose the attached.  It passes check-world, but that proves
> absolutely nothing of course :-(.  I wonder if there is any way to
> exercise these code paths deterministically.

This approach seems reasonable to me. At least you've managed to
structure the visibility map page pin check as concomitant with the
existing space recheck.

> (I have realized BTW that I was exceedingly fortunate to reproduce
> the buildfarm report here --- I have run hundreds of additional
> cycles of the same test scenario without getting a second failure.)

In the past I've had luck with rr's chaos mode (most notably with the
Jepsen SSI bug). That didn't work for me here, though I might just not
have persisted with it for long enough. I should probably come up with
a shell script that runs the same thing hundreds of times or more in
chaos mode, while making sure that useless recordings don't
accumulate.

The feature is described here:

https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mode.html

You only have to be lucky once. Once that happens, you're left with a
recording to review and re-review at your leisure. This includes all
Postgres backends, maybe even pg_regress and other scaffolding (if
that's what you're after).

But that's for debugging, not testing. The only way that we'll ever be
able to test stuff like this is with something like Alexander
Korotkov's stop events patch [1]. That infrastructure should be added
sooner rather than later.

[1] https://postgr.es/m/capphfdtseohx8dsk9qp+z++i4bgqoffkip6jdwngea+g7z-...@mail.gmail.com

--
Peter Geoghegan
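
For what it's worth, the chaos-mode loop mentioned above could look
roughly like the sketch below (Python rather than shell, but the idea is
the same). It is untested against this particular failure and rests on a
few assumptions: that "rr record --chaos" is the right invocation, that
rr honors the _RR_TRACE_DIR environment variable for where recordings
go, and that the recorded command reports the failure through a nonzero
exit status. The function name, its parameters, and the per-run
directory layout are made up for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def record_until_failure(cmd, max_runs, trace_root,
                         recorder=("rr", "record", "--chaos")):
    """Run cmd under the recorder up to max_runs times.

    Returns the trace directory of the first failing run, or None if
    every run passed. Recordings of passing runs are deleted so that
    useless traces do not accumulate.
    """
    os.makedirs(trace_root, exist_ok=True)
    for attempt in range(1, max_runs + 1):
        # Give each run its own directory so a passing run's trace can
        # be removed without touching anything else.
        trace_dir = tempfile.mkdtemp(prefix=f"run{attempt:04d}-",
                                     dir=trace_root)
        # Assumption: rr writes its recording under _RR_TRACE_DIR.
        env = dict(os.environ, _RR_TRACE_DIR=trace_dir)
        result = subprocess.run([*recorder, *cmd], env=env)
        if result.returncode != 0:
            # Lucky once: keep this recording for replay and stop.
            return trace_dir
        shutil.rmtree(trace_dir, ignore_errors=True)
    return None
```

Something like record_until_failure(["make", "check"], 500,
"/tmp/rr-chaos") would then either return the directory holding the one
interesting recording or return None after 500 clean runs. The command
shown is only a placeholder for whatever test scenario is being
repeated.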