On Thu, Aug 30, 2018 at 04:03:43PM +0200, Alexander Kukushkin wrote: > 2018-08-30 15:39 GMT+02:00 Michael Paquier <mich...@paquier.xyz>: >> Does it take care of the problem? > > Yep, with the patch applied bgwriter acts as expected!
Thanks for double-checking. I have been struggling for a couple of hours to get a deterministic test case out of my pocket, and I did not get one as you would need to get the bgwriter to flush a page before crash recovery finishes, we could do that easily with a dedicated bgworker, now pg_ctl start expects the standby to finish recovery before, which makes it harder to trigger all the time, particularly for slow machines . Anyway, I did more code review and I think that I found another issue with XLogNeedsFlush(), which could enforce updateMinRecoveryPoint to false if called before XLogFlush during crash recovery from another process than the startup process, so if it got called before XLogFlush() we'd still have the same issue for a process doing both operations. Hence, I have come up with the attached, which actually brings back the code to what it was before 8d68ee6 for those routines, except that we have fast-exit paths for the startup process so as it is still able to replay all WAL available and avoid page reference issues post-promotion, deciding when to update its own copy of minRecoveryPoint when it finishes crash recovery. This also saves from a couple of locks on the control file from the startup process. If you apply the patch and try it on your standby, are you able to get things up and working? -- Michael
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 493f1db7b9..65db2e48d8 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -2723,9 +2723,13 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force) * i.e., we're doing crash recovery. We never modify the control file's * value in that case, so we can short-circuit future checks here too. The * local values of minRecoveryPoint and minRecoveryPointTLI should not be - * updated until crash recovery finishes. + * updated until crash recovery finishes. We only do this for the startup + * process as it should not update its own reference of minRecoveryPoint + * until it has finished crash recovery to make sure that all WAL + * available is replayed in this case. This also saves from extra locks + * taken on the control file from the startup process. */ - if (XLogRecPtrIsInvalid(minRecoveryPoint)) + if (XLogRecPtrIsInvalid(minRecoveryPoint) && InRecovery) { updateMinRecoveryPoint = false; return; @@ -2737,7 +2741,9 @@ UpdateMinRecoveryPoint(XLogRecPtr lsn, bool force) minRecoveryPoint = ControlFile->minRecoveryPoint; minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; - if (force || minRecoveryPoint < lsn) + if (XLogRecPtrIsInvalid(minRecoveryPoint)) + updateMinRecoveryPoint = false; + else if (force || minRecoveryPoint < lsn) { XLogRecPtr newMinRecoveryPoint; TimeLineID newMinRecoveryPointTLI; @@ -3127,9 +3133,11 @@ XLogNeedsFlush(XLogRecPtr record) * An invalid minRecoveryPoint means that we need to recover all the * WAL, i.e., we're doing crash recovery. We never modify the control * file's value in that case, so we can short-circuit future checks - * here too. + * here too. This triggers a quick exit path for the startup process, + * which cannot update its local copy of minRecoveryPoint as long as + * it has not replayed all WAL available when doing crash recovery. */ - if (XLogRecPtrIsInvalid(minRecoveryPoint)) + if (XLogRecPtrIsInvalid(minRecoveryPoint) && InRecovery) updateMinRecoveryPoint = false; /* Quick exit if already known to be updated or cannot be updated */ @@ -3146,8 +3154,19 @@ XLogNeedsFlush(XLogRecPtr record) minRecoveryPointTLI = ControlFile->minRecoveryPointTLI; LWLockRelease(ControlFileLock); + /* + * Check minRecoveryPoint for any other process than the startup + * process doing crash recovery, which should not update the control + * file value if crash recovery is still running. + */ + if (XLogRecPtrIsInvalid(minRecoveryPoint)) + updateMinRecoveryPoint = false; + /* check again */ - return record > minRecoveryPoint; + if (record <= minRecoveryPoint || !updateMinRecoveryPoint) + return false; + else + return true; } /* Quick exit if already known flushed */
signature.asc
Description: PGP signature