On Wed, Nov 23, 2022 at 2:42 PM Andres Freund <and...@anarazel.de> wrote:
> The failure has to be happening in wait_for_postmaster_promote(), because the
> standby2 is actually successfully promoted.

I assume this is ext4. Presumably anything that reads the control file without interlocking against writes (pg_ctl, pg_checksums, pg_resetwal, pg_control_system(), ...) could see garbage. I have lost track of the versions and the thread, but at some point I worked out by experimentation that torn reads have always been possible with concurrent pread() and pwrite(), and only relatively recently became possible with concurrent read() and write(). The control file code uses the non-p variants, which previously didn't mash old and new data together like grated cheese under concurrency, thanks to some implementation detail, but now do.
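
To make the failure mode concrete, here is a rough sketch of how a reader without any interlock could at least detect a torn read and try again: verify a checksum over what it read, and re-read on mismatch. Everything in it is made up for illustration. DemoRecord, demo_checksum(), demo_write() and demo_read_retry() are hypothetical names, the checksum is a plain FNV-1a hash rather than the CRC the real control file carries, and the retry count and delay are arbitrary. It is one possible mitigation under those assumptions, not a description of what any of the tools above do.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a control file: payload plus trailing checksum. */
typedef struct DemoRecord
{
    unsigned char payload[64];
    uint32_t      checksum;
} DemoRecord;

/* FNV-1a hash, used only as a placeholder for a real CRC. */
static uint32_t
demo_checksum(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Write a zero-padded record with a valid checksum; true on success. */
static bool
demo_write(const char *path, const char *text)
{
    DemoRecord rec;
    ssize_t    n;
    int        fd;

    assert(strlen(text) < sizeof(rec.payload));
    memset(&rec, 0, sizeof(rec));
    memcpy(rec.payload, text, strlen(text));
    rec.checksum = demo_checksum(rec.payload, sizeof(rec.payload));

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return false;
    n = write(fd, &rec, sizeof(rec));
    close(fd);
    return n == (ssize_t) sizeof(rec);
}

/*
 * Read the record without any lock. A short read or a checksum mismatch is
 * treated as a possible torn read, so we sleep briefly and try again, up to
 * max_attempts times. Returns true once a consistent record has been read.
 */
static bool
demo_read_retry(const char *path, DemoRecord *rec, int max_attempts)
{
    const struct timespec delay = {0, 10 * 1000 * 1000};    /* 10 ms */

    for (int attempt = 0; attempt < max_attempts; attempt++)
    {
        ssize_t n;
        int     fd = open(path, O_RDONLY);

        if (fd < 0)
            return false;
        n = read(fd, rec, sizeof(*rec));
        close(fd);

        if (n == (ssize_t) sizeof(*rec) &&
            rec->checksum == demo_checksum(rec->payload, sizeof(rec->payload)))
            return true;

        nanosleep(&delay, NULL);
    }
    return false;
}
```

A caller would do something like demo_read_retry("some_file", &rec, 10) and only report corruption if that returns false. The obvious weakness is that a file that really is corrupt looks exactly like a torn read until the retries run out, so this only narrows the window; it does not replace proper interlocking between readers and writers.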