On 2013-04-02 12:10:12 +0200, Andres Freund wrote: > On 2013-04-01 08:49:16 +0100, Simon Riggs wrote: > > On 30 March 2013 17:21, Andres Freund <and...@2ndquadrant.com> wrote: > > > > > So if the xid is later than latestObservedXid we extend subtrans one by > > > one. So far so good. But we initialize it in > > > ProcArrayApplyRecoveryInfo() when consistency is initially reached: > > > latestObservedXid = running->nextXid; > > > TransactionIdRetreat(latestObservedXid); > > > Before that subtrans has initially been started up with: > > > if (wasShutdown) > > > oldestActiveXID = > > > PrescanPreparedTransactions(&xids, &nxids); > > > else > > > oldestActiveXID = > > > checkPoint.oldestActiveXid; > > > ... > > > StartupSUBTRANS(oldestActiveXID); > > > > > > That means its only initialized up to checkPoint.oldestActiveXid. As it > > > can take some time till we reach consistency it seems rather plausible > > > that there now will be a gap in initilized pages. From > > > checkPoint.oldestActiveXid to running->nextXid if there are pages > > > inbetween. > > > > That was an old bug. > > > > StartupSUBTRANS() now explicitly fills that gap. Are you saying it > > does that incorrectly? How? > > Well, no. I think StartupSUBTRANS does this correctly, but there's a gap > between the call to Startup* and the first call to ExtendSUBTRANS. The > latter is only called *after* we reached STANDBY_INITIALIZED via > ProcArrayApplyRecoveryInfo(). The problem is that we StartupSUBTRANS to > checkPoint.oldestActiveXid while we start to ExtendSUBTRANS from > running->nextXid - 1. There very well can be a gap inbetween. > The window isn't terribly big but if you use subtransactions as heavily > as Sergey seems to be it doesn't seem unlikely to hit it. > > Let me come up with a testcase and patch.
Developing a testcase was trivial, pgbench running the following function: CREATE OR REPLACE FUNCTION recurse_and_assign_txid(level bigint DEFAULT 0) RETURNS bigint LANGUAGE plpgsql AS $b$ BEGIN IF level < 500 THEN RETURN recurse_and_assign_txid(level + 1); ELSE -- assign xid in subtxn and parents CREATE TEMPORARY TABLE foo(); DROP TABLE foo; RETURN txid_current()::bigint; END IF; EXCEPTION WHEN others THEN RAISE NOTICE 'unexpected'; END $b$; When now restarting a standby (so it restarts from another checkpoint) it frequently crashed with various errors: * pg_subtrans/xxx does not exist * (warning) pg_subtrans page does not exist, assuming zero * xid overwritten in SubTransSetParent So I think my theory is correct. The attached patch fixes this although I don't like the way it knowledge of the point up to which StartupSUBTRANS zeroes pages is handled. Makes sense? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
>From cbb4e4d30ba1df5a3d370226545efe4966719867 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Tue, 2 Apr 2013 20:20:00 +0200 Subject: [PATCH] Ensure that SUBTRANS is initalized gaplessly when starting up HS --- src/backend/access/transam/xlog.c | 3 ++ src/backend/storage/ipc/procarray.c | 53 +++++++++++++++++++++++++++++++---- src/include/storage/procarray.h | 1 + 3 files changed, 51 insertions(+), 6 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 07c68ad..4a9462e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -5391,6 +5391,9 @@ StartupXLOG(void) oldestActiveXID = checkPoint.oldestActiveXid; Assert(TransactionIdIsValid(oldestActiveXID)); + /* Tell procarray about the range of xids it has to deal with */ + ProcArrayInitRecovery(ShmemVariableCache->nextXid); + /* * Startup commit log and subtrans only. Other SLRUs are not * maintained during recovery and need not be started yet. diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c index 4308128..53a5aa2 100644 --- a/src/backend/storage/ipc/procarray.c +++ b/src/backend/storage/ipc/procarray.c @@ -470,6 +470,28 @@ ProcArrayClearTransaction(PGPROC *proc) } /* + * ProcArrayInitRecovery -- initialize recovery xid mgmt environment + * + * Remember up to where the startup process initialized the CLOG and subtrans + * so we can ensure its initialized gaplessly up to the point where necessary + * while in recovery. + */ +void +ProcArrayInitRecovery(TransactionId initializedUptoXID) +{ + Assert(standbyState == STANDBY_INITIALIZED); + Assert(TransactionIdIsNormal(initializedUptoXID)); + + /* + * we set latestObservedXid to the xid SUBTRANS has been initialized upto + * so we can extend it from that point onwards when we reach a consistent + * state in ProcArrayApplyRecoveryInfo(). + */ + latestObservedXid = initializedUptoXID; + TransactionIdRetreat(latestObservedXid); +} + +/* * ProcArrayApplyRecoveryInfo -- apply recovery info about xids * * Takes us through 3 states: Initialized, Pending and Ready. @@ -564,7 +586,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running) Assert(standbyState == STANDBY_INITIALIZED); /* - * OK, we need to initialise from the RunningTransactionsData record + * OK, we need to initialise from the RunningTransactionsData record. + * + * NB: this can be reached at least twice, so make sure new code can deal + * with that. */ /* @@ -636,20 +661,32 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running) pfree(xids); /* + * latestObservedXid is set to the the point where SUBTRANS was started up + * to, initialize subtrans from thereon, up to nextXid - 1. + */ + Assert(TransactionIdIsNormal(latestObservedXid)); + while (TransactionIdPrecedes(latestObservedXid, running->nextXid)) + { + ExtendCLOG(latestObservedXid); + ExtendSUBTRANS(latestObservedXid); + + TransactionIdAdvance(latestObservedXid); + } + + /* ---------- * Now we've got the running xids we need to set the global values that * are used to track snapshots as they evolve further. * - * - latestCompletedXid which will be the xmax for snapshots - - * lastOverflowedXid which shows whether snapshots overflow - nextXid + * - latestCompletedXid which will be the xmax for snapshots + * - lastOverflowedXid which shows whether snapshots overflow + * - nextXid * * If the snapshot overflowed, then we still initialise with what we know, * but the recovery snapshot isn't fully valid yet because we know there * are some subxids missing. We don't know the specific subxids that are * missing, so conservatively assume the last one is latestObservedXid. + * ---------- */ - latestObservedXid = running->nextXid; - TransactionIdRetreat(latestObservedXid); - if (running->subxid_overflow) { standbyState = STANDBY_SNAPSHOT_PENDING; @@ -719,6 +756,10 @@ ProcArrayApplyXidAssignment(TransactionId topxid, Assert(standbyState >= STANDBY_INITIALIZED); + /* can't do anything useful unless we have more state setup */ + if (standbyState == STANDBY_INITIALIZED) + return; + max_xid = TransactionIdLatest(topxid, nsubxids, subxids); /* diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h index d5fdfea..c5f58b4 100644 --- a/src/include/storage/procarray.h +++ b/src/include/storage/procarray.h @@ -26,6 +26,7 @@ extern void ProcArrayRemove(PGPROC *proc, TransactionId latestXid); extern void ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid); extern void ProcArrayClearTransaction(PGPROC *proc); +extern void ProcArrayInitRecovery(TransactionId initializedUptoXID); extern void ProcArrayApplyRecoveryInfo(RunningTransactions running); extern void ProcArrayApplyXidAssignment(TransactionId topxid, int nsubxids, TransactionId *subxids); -- 1.7.10.4
-- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs