At Fri, 01 Apr 2022 14:44:30 +0900 (JST), Kyotaro Horiguchi <horikyota....@gmail.com> wrote in > At Sun, 13 Feb 2022 17:35:38 +0300, Yura Sokolov <y.soko...@postgrespro.ru> > wrote in > > And there are checks for `takenDuringRecovery` in `heapgetpage` and > > `heapam_scan_sample_next_tuple`. Are this checks affected by the change? > > Neither the preceding discussion nor commit message answer me. > > The snapshot works correctly, but for the heapgetpage case, it foces > all_visible to be false. That unnecessarily prevents visibility check > from skipping. > > An annoying thing in SnapBuildInitialSnapshot is that we don't know > the number of xids before looping over the xid range, and we don't > want to bother sorting top-level xids and subxids unless we have to do > so. > > Is it better that we hassle in SnapBuildInitialSnapshot to create a > !takenDuringRecovery snapshot?
So this is that. v5 creates a regular snapshot. By the way, is there any chance this could be committed to 15? Otherwise I'll immediately move this to the next CF. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
>From 6f3cfb58bc2c6294124831d6b410b0e51d067fb2 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota....@gmail.com> Date: Mon, 31 Jan 2022 15:03:33 +0900 Subject: [PATCH v5] Create correct snapshot during CREATE_REPLICATION_SLOT Snapshot can hold top XIDs up to the number of server processes but SnapBuildInitialSnapshot tries to store all top-level and sub XIDs to there and easily fails with "initial slot snapshot too large" error. We could instead create a "takenDuringRecovery" snapshot, which later leads to unnecessary visibility checks. Therefore we take trouble to create a regular snapshot by identifying whether each xids is top-level and storing it in the appropriate xid array. Author: Dilip Kumar and Kyotaro Horiguchi --- .../replication/logical/reorderbuffer.c | 20 +++++++++ src/backend/replication/logical/snapbuild.c | 45 +++++++++++++++---- src/include/replication/reorderbuffer.h | 1 + 3 files changed, 58 insertions(+), 8 deletions(-) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index c2d9be81fa..42adbe5f15 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -3658,6 +3658,26 @@ ReorderBufferXidHasBaseSnapshot(ReorderBuffer *rb, TransactionId xid) } +/* + * ReorderBufferXidIsKnownSubXact + * Returns true if the xid is a known subtransaction. + */ +bool +ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid) +{ + ReorderBufferTXN *txn; + + txn = ReorderBufferTXNByXid(rb, xid, false, + NULL, InvalidXLogRecPtr, false); + + /* a known subtxn? */ + if (txn && rbtxn_is_known_subxact(txn)) + return true; + + return false; +} + + /* * --------------------------------------- * Disk serialization support diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index 83fca8a77d..fa08e406e3 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -532,7 +532,10 @@ SnapBuildInitialSnapshot(SnapBuild *builder) Snapshot snap; TransactionId xid; TransactionId *newxip; - int newxcnt = 0; + TransactionId *newsubxip; + int newxcnt; + int newsubxcnt; + bool overflowed = false; Assert(!FirstSnapshotSet); Assert(XactIsoLevel == XACT_REPEATABLE_READ); @@ -568,9 +571,17 @@ SnapBuildInitialSnapshot(SnapBuild *builder) MyProc->xmin = snap->xmin; - /* allocate in transaction context */ + /* + * Allocate in transaction context. + * + * We could use only subxip to store all xids (takenduringrecovery + * snapshot) but that causes useless visibility checks later so we hasle to + * create a normal snapshot. + */ newxip = (TransactionId *) palloc(sizeof(TransactionId) * GetMaxSnapshotXidCount()); + newsubxip = (TransactionId *) + palloc(sizeof(TransactionId) * GetMaxSnapshotSubxidCount()); /* * snapbuild.c builds transactions in an "inverted" manner, which means it @@ -578,6 +589,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder) * classical snapshot by marking all non-committed transactions as * in-progress. This can be expensive. */ + newxcnt = 0; + newsubxcnt = 0; + for (xid = snap->xmin; NormalTransactionIdPrecedes(xid, snap->xmax);) { void *test; @@ -591,12 +605,24 @@ SnapBuildInitialSnapshot(SnapBuild *builder) if (test == NULL) { - if (newxcnt >= GetMaxSnapshotXidCount()) - ereport(ERROR, - (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE), - errmsg("initial slot snapshot too large"))); - - newxip[newxcnt++] = xid; + /* Store the xid to the appropriate xid array */ + if (ReorderBufferXidIsKnownSubXact(builder->reorder, xid)) + { + if (!overflowed) + { + if (newsubxcnt >= GetMaxSnapshotSubxidCount()) + overflowed = true; + else + newsubxip[newsubxcnt++] = xid; + } + } + else + { + if (newxcnt >= GetMaxSnapshotXidCount()) + elog(ERROR, + "too many transactions while creating snapshot"); + newxip[newxcnt++] = xid; + } } TransactionIdAdvance(xid); @@ -606,6 +632,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder) snap->snapshot_type = SNAPSHOT_MVCC; snap->xcnt = newxcnt; snap->xip = newxip; + snap->subxcnt = newsubxcnt; + snap->subxip = newsubxip; + snap->suboverflowed = overflowed; return snap; } diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h index 0bcc150b33..b26c34ce9d 100644 --- a/src/include/replication/reorderbuffer.h +++ b/src/include/replication/reorderbuffer.h @@ -707,6 +707,7 @@ void ReorderBufferXidSetCatalogChanges(ReorderBuffer *, TransactionId xid, XLog bool ReorderBufferXidHasCatalogChanges(ReorderBuffer *, TransactionId xid); bool ReorderBufferXidHasBaseSnapshot(ReorderBuffer *, TransactionId xid); +bool ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid); bool ReorderBufferRememberPrepareInfo(ReorderBuffer *rb, TransactionId xid, XLogRecPtr prepare_lsn, XLogRecPtr end_lsn, TimestampTz prepare_time, -- 2.27.0