On Fri, May 21, 2021 at 7:51 AM Kyotaro Horiguchi <horikyota....@gmail.com> wrote:
>
> https://www.postgresql.org/message-id/50E43C57.5050101%40vmware.com
>
> > That leaves one case not covered: If you take a backup with plain
> > "pg_basebackup" from a standby, without -X, and the first WAL segment
> > contains a timeline switch (ie. you take the backup right after a
> > failover), and you try to recover from it without a WAL archive, it
> > doesn't work. This is the original issue that started this thread,
> > except that I used "-x" in my original test case. The problem here is
> > that even though streaming replication will fetch the timeline history
> > file when it connects, at the very beginning of recovery, we expect that
> > we already have the timeline history file corresponding to the initial
> > timeline available, either in pg_xlog or the WAL archive. By the time
> > streaming replication has connected and fetched the history file, we've
> > already initialized expectedTLEs to contain just the latest TLI. To fix
> > that, we should delay calling readTimeLineHistoryFile() until streaming
> > replication has connected and fetched the file.
> > If the first segment read by recovery contains a timeline switch, the first
> > pages have an older timeline than the segment timeline. However, if
> > expectedTLEs contained only the segment timeline, we cannot know if
> > we can use the record. In that case the following error is issued.
>
> If expectedTLEs is initialized with the pseudo list,
> tliOfPointInHistory always returns the recoveryTargetTLI regardless of
> the given LSN so the checking about timelines later doesn't work. And
> later ReadRecord is surprised to see a page of an unknown timeline.
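
To make the quoted point about the pseudo list concrete, here is a minimal Python sketch (not PostgreSQL code) of the lookup that tliOfPointInHistory performs, as I understand it. The TimeLineEntry class, the tli_of_point_in_history function and all LSN values are made up for illustration, and I am assuming the pseudo list is a single entry for the target timeline with no begin/end bounds.

```python
from dataclasses import dataclass
from typing import List

INVALID_LSN = 0  # stand-in for InvalidXLogRecPtr, meaning "no bound" here


@dataclass
class TimeLineEntry:
    tli: int    # timeline ID
    begin: int  # first LSN on this timeline (inclusive), or INVALID_LSN
    end: int    # first LSN NOT on this timeline (exclusive), or INVALID_LSN


def tli_of_point_in_history(lsn: int, history: List[TimeLineEntry]) -> int:
    """Return the timeline that the given LSN belongs to.

    The history list is ordered newest timeline first, and the first
    entry whose [begin, end) range covers the LSN wins.
    """
    for entry in history:
        after_begin = entry.begin == INVALID_LSN or entry.begin <= lsn
        before_end = entry.end == INVALID_LSN or lsn < entry.end
        if after_begin and before_end:
            return entry.tli
    raise LookupError("LSN is not covered by any timeline in the history")


# Hypothetical scenario: timeline 1 was switched to timeline 2 at SWITCH_LSN,
# and the checkpoint record sits just before the switch, i.e. on timeline 1.
SWITCH_LSN = 0x3000060
CHECKPOINT_LSN = 0x3000028

# What we would have if the history file for timeline 2 had been read.
full_history = [
    TimeLineEntry(tli=2, begin=SWITCH_LSN, end=INVALID_LSN),
    TimeLineEntry(tli=1, begin=INVALID_LSN, end=SWITCH_LSN),
]

# Assumed shape of the pseudo list: only the target timeline, unbounded.
pseudo_history = [TimeLineEntry(tli=2, begin=INVALID_LSN, end=INVALID_LSN)]

tli_with_full_history = tli_of_point_in_history(CHECKPOINT_LSN, full_history)
tli_with_pseudo_list = tli_of_point_in_history(CHECKPOINT_LSN, pseudo_history)
```

With the full history the checkpoint LSN resolves to timeline 1, while with the pseudo list every LSN resolves to timeline 2, so a later timeline check has nothing to compare against.
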

From this whole discussion (on the thread you linked), IIUC the issue
is that the checkpoint LSN does not exist on the timeline
"ControlFile->checkPointCopy.ThisTimeLineID". If that is true, then I
agree that we will just initialize expectedTLEs with a single entry
(ControlFile->checkPointCopy.ThisTimeLineID), and later we will fail to
find the checkpoint record on this timeline because the checkpoint LSN
is smaller than the start LSN of this timeline. Right?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com