Hello Kyotaro,

On Fri, 26 Aug 2022 at 10:04, Kyotaro Horiguchi <horikyota....@gmail.com> wrote:

> > With archive_mode = always you can't reproduce it.
> > It is very rare for people to set it to always in production due to
> > the overhead.
> ...
> > The archive_mode has to be set to on and the archive_command should be
> > failing when you do pg_ctl -D oldprim stop
>
> Ah, I see.
>
> What I still don't understand is why pg_rewind doesn't work for the
> old primary in that case. When archive_mode=on, the old primary has
> the complete set of WAL files counting both pg_wal and its archive. So
> the same as in the previous repro, pg_rewind -c ought to work (but it
> uses its own archive this time). In that sense the proposed solution
> is still not needed in this case.

pg_rewind finishes successfully. But as a result it removes some files
from pg_wal that are required to perform recovery, because they are
missing on the new primary.

> A bit harder situation comes after the server successfully rewound; if
> the new primary goes so far that the old primary cannot connect. Even
> in that case, you can copy in the required WAL files or configure
> restore_command of the old primary so that it finds required WAL files
> there.

Yes, we can back up pg_wal before running pg_rewind, but it feels very
ugly, because we will also have to clean up this "backup" after a
successful recovery.

It would be much better if pg_rewind didn't remove WAL files between the
last common checkpoint and the diverged LSN in the first place.

Regards,
--
Alexander Kukushkin
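
P.S. To make the backup-and-cleanup workaround concrete, here is a rough Python sketch. It is only an illustration under assumptions: the helper names (backup_wal, restore_missing_wal, cleanup_backup) are hypothetical and not part of pg_rewind or any PostgreSQL tool, and for simplicity it copies every WAL segment rather than only those between the last common checkpoint and the diverged LSN.

```python
import re
import shutil
from pathlib import Path
from typing import List

# WAL segment file names consist of 24 hexadecimal characters.
# Assumption: only plain segments are saved; .history and .partial
# files are ignored in this sketch.
WAL_SEGMENT_RE = re.compile(r"^[0-9A-F]{24}$")


def backup_wal(pg_wal: Path, backup_dir: Path) -> List[str]:
    """Copy every WAL segment from pg_wal into backup_dir.

    Intended to run before pg_rewind. Returns the copied file names.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(pg_wal.iterdir()):
        if entry.is_file() and WAL_SEGMENT_RE.match(entry.name):
            shutil.copy2(entry, backup_dir / entry.name)
            copied.append(entry.name)
    return copied


def restore_missing_wal(backup_dir: Path, pg_wal: Path) -> List[str]:
    """Copy back only the segments that are no longer present in pg_wal.

    Intended to run after pg_rewind. Returns the restored file names.
    """
    restored = []
    for entry in sorted(backup_dir.iterdir()):
        target = pg_wal / entry.name
        if not target.exists():
            shutil.copy2(entry, target)
            restored.append(entry.name)
    return restored


def cleanup_backup(backup_dir: Path) -> None:
    """Remove the temporary backup once recovery has succeeded."""
    shutil.rmtree(backup_dir)
```

The intended order would be: backup_wal() on the stopped old primary, then pg_rewind, then restore_missing_wal(), then start the server, and cleanup_backup() only after recovery has succeeded. That last step is exactly the extra bookkeeping that makes this approach unattractive.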