On Wed, 2025-05-07 at 12:51 +0200, Luca Ferrari wrote:
> running 17.4 on ubuntu 24.04 machines. I've three hosts, pg-1
> (primary) and two physical replicas.
> I then promote host pg-3 as a master (pg_promote()) and want to rewind
> the pg-1 to follow the new master, so:
>
> ssh pg-3 'sudo -u postgres /usr/lib/postgresql/17/bin/pg_rewind -D
> /var/lib/postgresql/17/main --source-server="user=replica_fluca
> host=pg-3 dbname=replica_fluca"'
> pg_rewind: servers diverged at WAL location 0/B8550F8 on timeline 1
> pg_rewind: error: could not open file
> "/var/lib/postgresql/17/main/pg_wal/00000001000000000000000A": No such
> file or directory
> pg_rewind: error: could not find previous WAL record at 0/AFFF4E8
>
> But the file 0x010000A is not there:
>
>
> % ssh pg-3 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 00000001000000000000000B.partial
> 00000002.history
> 00000002000000000000000B
> 00000002000000000000000C
> 00000002000000000000000D
> 00000002000000000000000E
> archive_status
> summaries
>
> % ssh pg-1 'sudo ls /var/lib/postgresql/17/main/pg_wal'
> 000000010000000000000005.00000028.backup
> 00000001000000000000000B
> 00000001000000000000000C
> 00000001000000000000000D
> 00000001000000000000000E
> archive_status
> summaries
>
> Do i have to ensure the old primary pg-1 does a wal switch before
> promoting the other one and try to rewind?

I don't think it is connected to a WAL switch.

I'd say that you should set "wal_keep_size" high enough that all the WAL
needed for pg_rewind is still present.

If you have a WAL archive, you could define a restore_command on the
server you want to rewind.

Yours,
Laurenz Albe
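
As a rough sketch of those two settings in postgresql.conf (the size and
the archive path are placeholders chosen for illustration, not
recommendations):

```
# On the server that may later be rewound (here the old primary, pg-1).
# pg_rewind reads that server's own pg_wal back to the last checkpoint
# before the point of divergence, so retention has to cover that span.
wal_keep_size = '1GB'    # placeholder; size it to your WAL volume

# Only applicable if a WAL archive exists.
# /path/to/archive is a hypothetical location.
restore_command = 'cp /path/to/archive/%f %p'
```

With restore_command defined on the target, pg_rewind can be started
with --restore-target-wal (-c) so that it fetches segments missing from
pg_wal out of the archive.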