On Tue, Jan 17, 2023 at 07:44:52PM +0530, Bharath Rupireddy wrote:
> On Thu, Jan 12, 2023 at 6:21 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>> With your patch, we might replay one of these "old" files in pg_wal instead
>> of the complete version of the file from the archives,
>
> That's true even today, without the patch, no? We're not changing the
> existing behaviour of the state machine. Can you explain how it
> happens with the patch?

My point is that on HEAD, we will always prefer a complete archive file.
With your patch, we might instead choose to replay an old file in pg_wal
because we are artificially advancing the state machine.  IOW even if
there's a complete archive available, we might not use it.  This is a
behavior change, but I think it is okay.

>> Would you mind testing this scenario?
>
> How about something like below for testing the above scenario? If it
> looks okay, I can add it as a new TAP test file.
>
> 1. Generate WAL files f1 and f2 and archive them.
> 2. Check the replay lsn and WAL file name on the standby, when it
> replays upto f2, stop the standby.
> 3. Set recovery to fail on the standby, and stop the standby.
> 4. Generate f3, f4 (partially filled) on the primary.
> 5. Manually copy f3, f4 to the standby's pg_wal.
> 6. Start the standby, since recovery is set to fail, and there're new
> WAL files (f3, f4) under its pg_wal, it must replay those WAL files
> (check the replay lsn and WAL file name, it must be f4) before
> switching to streaming.
> 7. Generate f5 on the primary.
> 8. The standby should receive f5 and replay it (check the replay lsn
> and WAL file name, it must be f5).
> 9. Set streaming to fail on the standby and set recovery to succeed.
> 10. Generate f6 on the primary.
> 11. The standby should receive f6 via archive and replay it (check the
> replay lsn and WAL file name, it must be f6).

I meant testing the scenario where there's an old file in pg_wal, a
complete file in the archives, and your new GUC forces replay of the
former.  This might be difficult to do in a TAP test.  Ultimately, I just
want to validate the assumptions discussed above.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com