On Thu, Sep 1, 2022 at 3:08 PM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > At Thu, 1 Sep 2022 12:05:36 +1200, Thomas Munro <thomas.mu...@gmail.com> > wrote in > > On Thu, Sep 1, 2022 at 2:01 AM Justin Pryzby <pry...@telsasoft.com> wrote: > > > < 2022-08-31 08:44:10.495 CDT >LOG: checkpoint starting: > > > end-of-recovery immediate wait > > > < 2022-08-31 08:44:10.609 CDT >LOG: request to flush past end of > > > generated WAL; request 1201/1CAF84F0, current position 1201/1CADB730 > > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation > > > base/16881/2840_vm > > > < 2022-08-31 08:44:10.609 CDT >ERROR: xlog flush request 1201/1CAF84F0 > > > is not satisfied --- flushed only to 1201/1CADB730 > > > < 2022-08-31 08:44:10.609 CDT >CONTEXT: writing block 0 of relation > > > base/16881/2840_vm > > > < 2022-08-31 08:44:10.609 CDT >FATAL: checkpoint request failed > > > > > > I was able to start it with -c recovery_prefetch=no, so it seems like > > > prefetch tried to do too much. The VM runs centos7 under qemu. > > > I'm making a copy of the data dir in cases it's needed. > > Just for information, there was a fixed bug about > overwrite-aborted-contrecord feature, which causes this kind of > failure (xlog flush request exceeds insertion bleeding edge). If it is > that, it has been fixed by 6672d79139 two-days ago.
Hmm. Justin, when you built from source, which commit were you at? If it's REL_15_BETA3, any chance you could cherry pick that change and check what happens? And without that, could you show what this logs for good and bad recovery settings?
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 248a40e8fa..1e435e81d5 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5308,6 +5308,7 @@ StartupXLOG(void) */ endOfRecoveryInfo = FinishWalRecovery(); EndOfLog = endOfRecoveryInfo->endOfLog; +elog(LOG, "XXX point 1: EndOfLog = %lx", EndOfLog); EndOfLogTLI = endOfRecoveryInfo->endOfLogTLI; abortedRecPtr = endOfRecoveryInfo->abortedRecPtr; missingContrecPtr = endOfRecoveryInfo->missingContrecPtr; @@ -5446,7 +5447,9 @@ StartupXLOG(void) { Assert(!XLogRecPtrIsInvalid(abortedRecPtr)); EndOfLog = missingContrecPtr; +elog(LOG, "XXX clobbering EndOfLog"); } +elog(LOG, "XXX point 2: EndOfLog = %lx", EndOfLog); /* * Prepare to write WAL starting at EndOfLog location, and init xlog @@ -5456,6 +5459,7 @@ StartupXLOG(void) Insert = &XLogCtl->Insert; Insert->PrevBytePos = XLogRecPtrToBytePos(endOfRecoveryInfo->lastRec); Insert->CurrBytePos = XLogRecPtrToBytePos(EndOfLog); +elog(LOG, "XXX point 3: Insert->CurrBytePos = %lx", Insert->CurrBytePos); /* * Tricky point here: lastPage contains the *last* block that the LastRec @@ -5554,6 +5558,7 @@ StartupXLOG(void) Insert->fullPageWrites = lastFullPageWrites; UpdateFullPageWrites(); +elog(LOG, "XXX point 4: Insert->CurrBytePos = %lx", Insert->CurrBytePos); /* * Emit checkpoint or end-of-recovery record in XLOG, if required. */ diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 141af04388..8c6753a8a8 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -1434,6 +1434,8 @@ FinishWalRecovery(void) XLogPrefetcherBeginRead(xlogprefetcher, lastRec); (void) ReadRecord(xlogprefetcher, PANIC, false, lastRecTLI); endOfLog = xlogreader->EndRecPtr; +elog(LOG, "point 0: lastRec = %lx", lastRec); +elog(LOG, "point 0: endOfLog = %lx", endOfLog); /* * Remember the TLI in the filename of the XLOG segment containing the