Hi all, On 2012-11-27 19:52:09 +0000, tar...@gmail.com wrote: > This happens only if wal_level=hot_standby. > > > Here are the steps to reproduce this issue.
Oh. Fucking. Wow. I think this tiny comment just helped me find the bug. After previously having looked for it without success for some time I just recognized the problem while working on something unrelated. Here it goes: void CreateCheckPoint(int flags) { ... /* * Get the other info we need for the checkpoint record. */ LWLockAcquire(XidGenLock, LW_SHARED); checkPoint.nextXid = ShmemVariableCache->nextXid; checkPoint.oldestXid = ShmemVariableCache->oldestXid; checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB; LWLockRelease(XidGenLock); /* Increase XID epoch if we've wrapped around since last checkpoint */ checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch; if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid) checkPoint.nextXidEpoch++; // i.e. compute the epoch based on the *current* nextXid. ... // do all the writing, take some time CheckPointGuts(checkPoint.redo, flags); ... * Update checkPoint.nextXid since we have a later value */ if (!shutdown && XLogStandbyInfoActive()) LogStandbySnapshot(&checkPoint.nextXid); ... /* Update shared-memory copy of checkpoint XID/epoch */ { /* use volatile pointer to prevent code rearrangement */ volatile XLogCtlData *xlogctl = XLogCtl; SpinLockAcquire(&xlogctl->info_lck); xlogctl->ckptXidEpoch = checkPoint.nextXidEpoch; xlogctl->ckptXid = checkPoint.nextXid; SpinLockRelease(&xlogctl->info_lck); } ... } Notice the end of the comment above/about LogStandbySnapshot. It updates nextXid! But it does *not* recheck whether we have had a wraparound + so it does not increase the epoch counter. Although we might have had a wraparound inbetween. Which in turn means the the ShmemVariableCache->nextXid < xlogctl->ckptXid computation in GetNextXidAndEpoch doesn't return correct results anymore because it can't recognize that were in a new epoch now. Trivial patch attached. I am not sure that I understand why its interesting to have a newer ->nextXid in the checkpoint, but I don't see anything that would make it dangerous besides this. This looks like it also explains #6291 and the slighly different issue described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=q...@mail.gmail.com, as the issue is just as present on standbys as it is on primaries. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
>From bf1ff10acf7dbfd9a6046c5fcb4bb816485a1e7b Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Sat, 1 Dec 2012 23:51:34 +0100 Subject: [PATCH] Fix xid epoch calculation with wal_level=hot_standby When wal_level=hot_standby CreateCheckPoint also logs running transactions. During the computation of those it recomputes CheckPoint->nextXid. Unfortunately it does so after it has been used to compute CheckPoint->nextXidEpoch. Move code around slightly to fix the issue. This fixes #6291, #7710 and the slighly different issue described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=q...@mail.gmail.com, as the issue is just as present on standbys as it is on primaries. Bug found by Daniel Farina and Tarvi Pillessaar --- src/backend/access/transam/xlog.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 8b19976..4760081 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -7072,11 +7072,6 @@ CreateCheckPoint(int flags) checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB; LWLockRelease(XidGenLock); - /* Increase XID epoch if we've wrapped around since last checkpoint */ - checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch; - if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid) - checkPoint.nextXidEpoch++; - LWLockAcquire(OidGenLock, LW_SHARED); checkPoint.nextOid = ShmemVariableCache->nextOid; if (!shutdown) @@ -7115,6 +7110,14 @@ CreateCheckPoint(int flags) START_CRIT_SECTION(); /* + * Increase XID epoch if we've wrapped around since last checkpoint, do + * this after LogStandbySnapshot which updates nextXid. + */ + checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch; + if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid) + checkPoint.nextXidEpoch++; + + /* * Now insert the checkpoint record into XLOG. */ rdata.data = (char *) (&checkPoint); -- 1.7.12.289.g0ce9864.dirty
-- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs