At Fri, 01 Apr 2022 14:44:30 +0900 (JST), Kyotaro Horiguchi 
<horikyota....@gmail.com> wrote in 
> At Sun, 13 Feb 2022 17:35:38 +0300, Yura Sokolov <y.soko...@postgrespro.ru> 
> wrote in 
> > And there are checks for `takenDuringRecovery` in `heapgetpage` and
> > `heapam_scan_sample_next_tuple`. Are this checks affected by the change?
> > Neither the preceding discussion nor commit message answer me.
> 
> The snapshot works correctly, but for the heapgetpage case, it foces
> all_visible to be false.  That unnecessarily prevents visibility check
> from skipping.
> 
> An annoying thing in SnapBuildInitialSnapshot is that we don't know
> the number of xids before looping over the xid range, and we don't
> want to bother sorting top-level xids and subxids unless we have to do
> so.
> 
> Is it better that we hassle in SnapBuildInitialSnapshot to create a
> !takenDuringRecovery snapshot?

So this is that. v5 creates a regular snapshot.

By the way, is there any chance this could be committed to 15?
Otherwise I'll immediately move this to the next CF.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 6f3cfb58bc2c6294124831d6b410b0e51d067fb2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Mon, 31 Jan 2022 15:03:33 +0900
Subject: [PATCH v5] Create correct snapshot during CREATE_REPLICATION_SLOT

Snapshot can hold top XIDs up to the number of server processes but
SnapBuildInitialSnapshot tries to store all top-level and sub XIDs to
there and easily fails with "initial slot snapshot too large" error.

We could instead create a "takenDuringRecovery" snapshot, which later
leads to unnecessary visibility checks. Therefore we take trouble to
create a regular snapshot by identifying whether each xids is
top-level and storing it in the appropriate xid array.

Author: Dilip Kumar and Kyotaro Horiguchi
---
 .../replication/logical/reorderbuffer.c       | 20 +++++++++
 src/backend/replication/logical/snapbuild.c   | 45 +++++++++++++++----
 src/include/replication/reorderbuffer.h       |  1 +
 3 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index c2d9be81fa..42adbe5f15 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -3658,6 +3658,26 @@ ReorderBufferXidHasBaseSnapshot(ReorderBuffer *rb, TransactionId xid)
 }
 
 
+/*
+ * ReorderBufferXidIsKnownSubXact
+ *		Returns true if the xid is a known subtransaction.
+ */
+bool
+ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid)
+{
+	ReorderBufferTXN *txn;
+
+	txn = ReorderBufferTXNByXid(rb, xid, false,
+								NULL, InvalidXLogRecPtr, false);
+
+	/* a known subtxn? */
+	if (txn && rbtxn_is_known_subxact(txn))
+		return true;
+
+	return false;
+}
+
+
 /*
  * ---------------------------------------
  * Disk serialization support
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 83fca8a77d..fa08e406e3 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -532,7 +532,10 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
 	Snapshot	snap;
 	TransactionId xid;
 	TransactionId *newxip;
-	int			newxcnt = 0;
+	TransactionId *newsubxip;
+	int			newxcnt;
+	int			newsubxcnt;
+	bool		overflowed = false;
 
 	Assert(!FirstSnapshotSet);
 	Assert(XactIsoLevel == XACT_REPEATABLE_READ);
@@ -568,9 +571,17 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
 
 	MyProc->xmin = snap->xmin;
 
-	/* allocate in transaction context */
+	/*
+	 * Allocate in transaction context.	
+	 *
+	 * We could use only subxip to store all xids (takenduringrecovery
+	 * snapshot) but that causes useless visibility checks later so we hasle to
+	 * create a normal snapshot.
+	 */
 	newxip = (TransactionId *)
 		palloc(sizeof(TransactionId) * GetMaxSnapshotXidCount());
+	newsubxip = (TransactionId *)
+		palloc(sizeof(TransactionId) * GetMaxSnapshotSubxidCount());
 
 	/*
 	 * snapbuild.c builds transactions in an "inverted" manner, which means it
@@ -578,6 +589,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
 	 * classical snapshot by marking all non-committed transactions as
 	 * in-progress. This can be expensive.
 	 */
+	newxcnt = 0;
+	newsubxcnt = 0;
+
 	for (xid = snap->xmin; NormalTransactionIdPrecedes(xid, snap->xmax);)
 	{
 		void	   *test;
@@ -591,12 +605,24 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
 
 		if (test == NULL)
 		{
-			if (newxcnt >= GetMaxSnapshotXidCount())
-				ereport(ERROR,
-						(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
-						 errmsg("initial slot snapshot too large")));
-
-			newxip[newxcnt++] = xid;
+			/* Store the xid to the appropriate xid array */
+			if (ReorderBufferXidIsKnownSubXact(builder->reorder, xid))
+			{
+				if (!overflowed)
+				{
+					if (newsubxcnt >= GetMaxSnapshotSubxidCount())
+						overflowed = true;
+					else
+						newsubxip[newsubxcnt++] = xid;
+				}
+			}
+			else
+			{
+				if (newxcnt >= GetMaxSnapshotXidCount())
+					elog(ERROR,
+						 "too many transactions while creating snapshot");
+				newxip[newxcnt++] = xid;
+			}
 		}
 
 		TransactionIdAdvance(xid);
@@ -606,6 +632,9 @@ SnapBuildInitialSnapshot(SnapBuild *builder)
 	snap->snapshot_type = SNAPSHOT_MVCC;
 	snap->xcnt = newxcnt;
 	snap->xip = newxip;
+	snap->subxcnt = newsubxcnt;
+	snap->subxip = newsubxip;
+	snap->suboverflowed = overflowed;
 
 	return snap;
 }
diff --git a/src/include/replication/reorderbuffer.h b/src/include/replication/reorderbuffer.h
index 0bcc150b33..b26c34ce9d 100644
--- a/src/include/replication/reorderbuffer.h
+++ b/src/include/replication/reorderbuffer.h
@@ -707,6 +707,7 @@ void		ReorderBufferXidSetCatalogChanges(ReorderBuffer *, TransactionId xid, XLog
 bool		ReorderBufferXidHasCatalogChanges(ReorderBuffer *, TransactionId xid);
 bool		ReorderBufferXidHasBaseSnapshot(ReorderBuffer *, TransactionId xid);
 
+bool		ReorderBufferXidIsKnownSubXact(ReorderBuffer *rb, TransactionId xid);
 bool		ReorderBufferRememberPrepareInfo(ReorderBuffer *rb, TransactionId xid,
 											 XLogRecPtr prepare_lsn, XLogRecPtr end_lsn,
 											 TimestampTz prepare_time,
-- 
2.27.0

Reply via email to