On Thu, May 27, 2021 at 12:09 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote:
>
> At Thu, 27 May 2021 11:44:47 +0530, Dilip Kumar <dilipbal...@gmail.com> wrote in
> > Maybe we can somehow achieve that without a broken archive command,
> > but I am not sure how it is enough to just delete WAL from pg_wal? I
> > mean my original case was that
> > 1. Got the new history file from the archive but did not get the WAL
> > file yet which contains the checkpoint after TL switch
> > 2. So the standby2 try to stream using new primary using old TL and
> > set the wrong TL in expectedTLEs
> >
> > But if you are not doing anything to stop archiving WAL files or to
> > guarantee that WAL has come to archive and you deleted those then I am
> > not sure how we are reproducing the original problem.
>
> Thanks for the reply!
>
> We're writing at the very beginning of the switching segment at the
> promotion time. So it is guaranteed that the first segment of the
> newer timeline won't be archived until the rest almost 16MB in the
> segment is consumed or someone explicitly causes a segment switch
> (including archive timeout).

I agree

> > BTW, I have also tested your script and I found below log, which shows
> > that standby2 is successfully able to select the timeline2 so it is
> > not reproducing the issue. Am I missing something?
>
> standby_2? My last one 026_timeline_issue_2.pl doesn't use that name
> and uses "standby_1" and "cascade". In the earlier ones, standby_4 and
> 5 (or 3 and 4 in the later versions) are used in the additional tests.
>
> So I think it should be something different?

Yeah, I tested with your patch where you had a different test case. With
"026_timeline_issue_2.pl", I am able to reproduce the issue.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com