On Wed, Aug 24, 2022 at 4:40 AM Kyotaro Horiguchi <horikyota....@gmail.com> wrote:
> Me, too. There are two ways to deal with this, I think. One is start
> writing new records from abortedContRecPtr as if it were not
> exist. Another is copying WAL file up to missingContRecPtr. Since the
> first segment of the new timeline doesn't need to be identical to the
> last one of the previous timeline, so I think the former way is
> cleaner.

I agree, mostly because that gets us back to the way all of this
worked before the contrecord stuff went in. This case wasn't broken
then, because the breakage had to do with it being unsafe to back up
and rewrite WAL that might have already been shipped someplace, and
that's not an issue when we're first creating a totally new timeline.
It seems safer to me to go back to the way this worked before the fix
went in than to change over to a new system.

Honestly, in a vacuum, I might prefer to get rid of this thing where
the WAL segment gets copied over from the old timeline to the new, and
just always switch TLIs at segment boundaries. And while we're at it,
I'd also like TLIs to be 64-bit random numbers instead of integers
assigned in ascending order. But those kinds of design changes seem
best left for a future master-only development effort. Here, we need
to back-patch the fix, and should try to just unbreak what's currently
broken.

> XLogInitNewTimeline or near seems to be the place for fix
> to me. Clearing abortedRecPtr and missingContrecPtr just before the
> call to findNewestTimeLine will work?

Hmm, yeah, that seems like a good approach.

--
Robert Haas
EDB: http://www.enterprisedb.com