At Tue, 19 Apr 2022 13:37:59 -0700, Nathan Bossart <nathandboss...@gmail.com> wrote in > On Mon, Apr 18, 2022 at 04:44:03PM -0400, Robert Haas wrote: > > Here is a new version of the patch which, unlike v1, I think is > > something we could seriously consider applying (not before v16, of > > course). It now removes CHECKPOINT_END_OF_RECOVERY completely, and I > > attach a second patch as well which nukes checkPoint.PrevTimeLineID as > > well. > > I'd like to add a big +1 for this change. IIUC this should help with some > of the problems I've noted elsewhere [0].
Agreed. > > I mentioned two problems with $SUBJECT in the first email with this > > subject line. One was a bug, which Noah has since fixed (thanks, > > Noah). The other problem is that LogStandbySnapshot() and a bunch of > > its friends expect latestCompletedXid to always be a normal XID, which > > is a problem because (1) we currently set nextXid to 3 and (2) at > > startup, we compute latestCompletedXid = nextXid - 1. The current code > > dodges this kind of accidentally: the checkpoint that happens at > > startup is a "shutdown checkpoint" and thus skips logging a standby > > snapshot, since a shutdown checkpoint is a sure indicator that there > > are no running transactions. With the changes, the checkpoint at > > startup happens after we've started allowing write transactions, and > > thus a standby snapshot needs to be logged also. In the cases where > > the end-of-recovery record was already being used, the problem could > > have happened already, except for the fact that those cases involve a > > standby promotion, which doesn't happen during initdb. I explored a > > few possible ways of solving this problem. > > Shouldn't latestCompletedXid be set to MaxTransactionId in this case? Or > is this related to the logic in FullTransactionIdRetreat() that avoids > skipping over the "actual" special transaction IDs? As the result FullTransactionIdRetreat(FirstNormalFullTransactionId) results in FrozenTransactionId, which looks odd. It seems to me rather should be InvalidFullTransactionId, or simply should assert-out that case. But incrmenting the very first xid avoid all that complexity. It is somewhat hacky but very smiple and understandable. > > So ... I decided that the best approach here seems to be to just skip > > FirstNormalTransactionId and use FirstNormalTransactionId + 1 for the > > first write transaction that the cluster ever processes. That's very > > simple and doesn't seem likely to break anything else. On the downside > > it seems a bit grotty, but I don't see anything better, and on the > > whole, I think with this approach we come out substantially ahead. > > 0001 removes 3 times as many lines as it inserts, and 0002 saves a few > > more lines of code. > > This doesn't seem all that bad to me. It's a little hacky, but it's very > easy to understand and only happens once per initdb. I don't think it's > worth any extra complexity. +1. > > Now, I still don't really know that there isn't some theoretical > > difficulty here that makes this whole approach a non-starter, but I > > also can't think of what it might be. If the promotion case, which has > > used the end-of-recovery record for many years, is basically safe, > > despite the fact that it switches TLIs, then it seems to me that the > > crash recovery case, which doesn't have that complication, ought to be > > safe too. But I might well be missing something, so if you see a > > problem, please speak up! > > Your reasoning seems sound to me. > > [0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com FWIW, I don't find a flaw in the reasoning, too. By the way do we need to leave CheckPoint.PrevTimeLineID? It is always the same value with ThisTimeLineID. The most dubious part is ApplyWalRecord but XLOG_CHECKPOINT_SHUTDOWN no longer induces timeline switch. regards. -- Kyotaro Horiguchi NTT Open Source Software Center