Hi,

Robert, Tom, it'd be great if you could look through this thread. I
think there's a problem here (and it has gotten worse after the
introduction of catalog snapshots). Both of you at least dabbled in
related code.


On 2020-02-29 12:17:07 -0800, Andres Freund wrote:
> On 2020-02-28 22:10:52 -0800, Andres Freund wrote:
> > So, um. What happens is that doDeletion() does a catalog scan, which
> > sets a snapshot. The results of that catalog scan are then used to
> > perform modifications. But at that point there's no guarantee that we
> > still hold *any* snapshot, as e.g. invalidations can trigger the catalog
> > snapshot being released.
> > 
> > I don't see how that's safe. Without ->xmin preventing that,
> > intermediate row versions that we did look up could just get vacuumed
> > away, and replaced with a different row. That does seem like a serious
> > issue?
> > 
> > I think there's likely a lot of places that can hit this? What makes it
> > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release
> > ->xmin when there previously has been *any* catalog access? Because in
> > contrast to normal table modifications, there's currently nothing at all
> > forcing us to hold a snapshot between catalog lookups and their
> > modifications?
> > 
> > Am I missing something? Or is this a fairly significant hole in our
> > arrangements?
> 
> I still think that's true.  In a first iteration I hacked around the
> problem by explicitly registering a catalog snapshot in
> RemoveTempRelations(). That *sometimes* gets us through the regression
> tests without the assertions triggering.

The attached two patches (they're not meant to be applied) reliably get
through the regression tests. But I suspect I'd have to at least do a
CLOBBER_CACHE_ALWAYS run to find all the actually vulnerable places.


> But I don't think that's good enough (even if we fixed the other
> potential crashes similarly). The only reason that avoids the asserts is
> because in nearly all other cases there's also a user snapshot that's
> pushed. But that pushed snapshot can have an xmin that's newer than the
> catalog snapshot, which means we're still in danger of tids from catalog
> scans being outdated.
> 
> My preliminary conclusion is that it's simply not safe to do
> SnapshotResetXmin() from within InvalidateCatalogSnapshot(),
> PopActiveSnapshot(), UnregisterSnapshotFromOwner() etc. Instead we need
> to defer the SnapshotResetXmin() call until at least
> CommitTransactionCommand()? Outside of that there ought (with exception
> of multi-transaction commands, but they have to be careful anyway) to be
> no "in progress" sequences of related catalog lookups/modifications.
> 
> Alternatively we could make sure that every catalog lookup/mod sequence
> registers its first catalog snapshot. But that seems like a gargantuan
> task?
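
As a rough illustration of the deferral idea quoted above, here is a toy
model in C. It is not PostgreSQL code: ToySnapMgr, toy_acquire(),
toy_release() and toy_commit_command() are hypothetical names, and the
model ignores that the real SnapshotResetXmin() can also advance xmin to
the oldest remaining registered snapshot. It only shows that if releasing
the last snapshot merely marks a reset as pending, xmin stays set between
a catalog lookup and the modification that follows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name below is hypothetical, not PostgreSQL's. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID 0

typedef struct ToySnapMgr
{
	ToyXid		xmin;			/* horizon this backend advertises */
	int			holders;		/* registered + active snapshots */
	bool		reset_pending;	/* last holder went away mid-command */
} ToySnapMgr;

/* Taking a snapshot sets xmin if it is not set yet. */
static void
toy_acquire(ToySnapMgr *m, ToyXid oldest_running)
{
	m->holders++;
	if (m->xmin == TOY_INVALID_XID)
		m->xmin = oldest_running;
}

/* Releasing a snapshot never clears xmin directly; it only notes it. */
static void
toy_release(ToySnapMgr *m)
{
	assert(m->holders > 0);
	m->holders--;
	if (m->holders == 0)
		m->reset_pending = true;
}

/* The deferred reset happens at the end of the transaction command. */
static void
toy_commit_command(ToySnapMgr *m)
{
	if (m->reset_pending && m->holders == 0)
		m->xmin = TOY_INVALID_XID;
	m->reset_pending = false;
}

/* xmin as seen between a catalog lookup and the following modification. */
ToyXid
toy_xmin_between_lookup_and_update(void)
{
	ToySnapMgr	m = {TOY_INVALID_XID, 0, false};

	toy_acquire(&m, 100);		/* catalog lookup takes a snapshot */
	toy_release(&m);			/* dropped again, e.g. by an invalidation */
	return m.xmin;				/* still set: the looked-up tid is protected */
}

/* xmin once the command has committed. */
ToyXid
toy_xmin_after_commit(void)
{
	ToySnapMgr	m = {TOY_INVALID_XID, 0, false};

	toy_acquire(&m, 100);
	toy_release(&m);
	toy_commit_command(&m);
	return m.xmin;
}
```

In this model toy_xmin_between_lookup_and_update() still reports the
horizon taken at lookup time, and it is only cleared once
toy_commit_command() has run.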

I also just noticed comments of this style in a few places:
         * Start a transaction so we can access pg_database, and get a snapshot.
         * We don't have a use for the snapshot itself, but we're interested in
         * the secondary effect that it sets RecentGlobalXmin.  (This is critical
         * for anything that reads heap pages, because HOT may decide to prune
         * them even if the process doesn't attempt to modify any tuples.)
followed by code like

        StartTransactionCommand();
        (void) GetTransactionSnapshot();

        rel = table_open(DatabaseRelationId, AccessShareLock);
        scan = table_beginscan_catalog(rel, 0, NULL);

which is not safe at all, unfortunately. The snapshot is not
pushed/active, therefore invalidations processed e.g. as part of the
table_open() could execute an InvalidateCatalogSnapshot(), which in turn
would remove the catalog snapshot from the pairing heap and call
SnapshotResetXmin().  And poof, the backend's xmin is gone.
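
To make that sequence concrete, here is a minimal sketch of it as a toy
model in C. Nothing in it is PostgreSQL code: ToyBackend and the toy_*()
functions are made-up names, and the xid value 100 is arbitrary. The
model assumes that xmin is cleared as soon as neither a registered/pushed
snapshot nor the catalog snapshot exists, which is the behaviour described
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name below is hypothetical, not PostgreSQL's. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID 0

typedef struct ToyBackend
{
	ToyXid		advertised_xmin;	/* what vacuum would see for us */
	int			registered;			/* snapshots registered or pushed */
	bool		catalog_snapshot;	/* implicit catalog snapshot alive? */
} ToyBackend;

/* Clear xmin once nothing holds a snapshot any more. */
static void
toy_reset_xmin(ToyBackend *b)
{
	if (b->registered == 0 && !b->catalog_snapshot)
		b->advertised_xmin = TOY_INVALID_XID;
}

/* Explicitly registering a snapshot pins xmin until it is unregistered. */
static void
toy_register(ToyBackend *b, ToyXid oldest_running)
{
	b->registered++;
	if (b->advertised_xmin == TOY_INVALID_XID)
		b->advertised_xmin = oldest_running;
}

/* A catalog scan implicitly takes the catalog snapshot. */
static void
toy_catalog_scan(ToyBackend *b, ToyXid oldest_running)
{
	b->catalog_snapshot = true;
	if (b->advertised_xmin == TOY_INVALID_XID)
		b->advertised_xmin = oldest_running;
}

/* Invalidation processing drops the catalog snapshot and resets xmin. */
static void
toy_process_invalidation(ToyBackend *b)
{
	b->catalog_snapshot = false;
	toy_reset_xmin(b);
}

/* Does xmin survive an invalidation that arrives after the scan? */
bool
toy_xmin_survives(bool caller_registers)
{
	ToyBackend	b = {TOY_INVALID_XID, 0, false};

	if (caller_registers)
		toy_register(&b, 100);
	toy_catalog_scan(&b, 100);
	toy_process_invalidation(&b);	/* e.g. while opening a relation */
	return b.advertised_xmin != TOY_INVALID_XID;
}
```

With caller_registers set to false the model loses its xmin right after
the scan; with true it keeps it. The workaround patch below relies on
that second case when it wraps code in RegisterSnapshot() and
UnregisterSnapshot().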

Greetings,

Andres Freund
From 3b58990c088936122f38d855a5a3900602deacf7 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 6 Apr 2020 21:28:55 -0700
Subject: [PATCH 1/2] TMP: work around missing snapshot registrations.

This is just what's hit by the tests. It's not an actual fix.
---
 src/backend/catalog/namespace.c             |  7 +++++++
 src/backend/catalog/pg_subscription.c       |  4 ++++
 src/backend/commands/indexcmds.c            |  9 +++++++++
 src/backend/commands/tablecmds.c            |  8 ++++++++
 src/backend/replication/logical/tablesync.c | 12 ++++++++++++
 src/backend/replication/logical/worker.c    |  4 ++++
 src/backend/utils/time/snapmgr.c            |  4 ++++
 7 files changed, 48 insertions(+)

diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 2ec23016fe5..e4696d8d417 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -55,6 +55,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/varlena.h"
 
@@ -4244,12 +4245,18 @@ RemoveTempRelationsCallback(int code, Datum arg)
 {
 	if (OidIsValid(myTempNamespace))	/* should always be true */
 	{
+		Snapshot snap;
+
 		/* Need to ensure we have a usable transaction. */
 		AbortOutOfAnyTransaction();
 		StartTransactionCommand();
 
+		/* ensure xmin stays set */
+		snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
 		RemoveTempRelations(myTempNamespace);
 
+		UnregisterSnapshot(snap);
 		CommitTransactionCommand();
 	}
 }
diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c
index cb157311154..4a324dfb4f1 100644
--- a/src/backend/catalog/pg_subscription.c
+++ b/src/backend/catalog/pg_subscription.c
@@ -31,6 +31,7 @@
 #include "utils/fmgroids.h"
 #include "utils/pg_lsn.h"
 #include "utils/rel.h"
+#include "utils/snapmgr.h"
 #include "utils/syscache.h"
 
 static List *textarray_to_stringlist(ArrayType *textarray);
@@ -286,6 +287,7 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 	bool		nulls[Natts_pg_subscription_rel];
 	Datum		values[Natts_pg_subscription_rel];
 	bool		replaces[Natts_pg_subscription_rel];
+	Snapshot snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
 
 	LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock);
 
@@ -321,6 +323,8 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state,
 
 	/* Cleanup. */
 	table_close(rel, NoLock);
+
+	UnregisterSnapshot(snap);
 }
 
 /*
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index 2baca12c5f4..094bf6139f0 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -2837,6 +2837,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	char	   *relationName = NULL;
 	char	   *relationNamespace = NULL;
 	PGRUsage	ru0;
+	Snapshot	snap;
 
 	/*
 	 * Create a memory context that will survive forced transaction commits we
@@ -3306,6 +3307,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	 */
 
 	StartTransactionCommand();
+	snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
 
 	forboth(lc, indexIds, lc2, newIndexIds)
 	{
@@ -3354,8 +3356,11 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	}
 
 	/* Commit this transaction and make index swaps visible */
+	UnregisterSnapshot(snap);
 	CommitTransactionCommand();
+
 	StartTransactionCommand();
+	snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
 
 	/*
 	 * Phase 5 of REINDEX CONCURRENTLY
@@ -3386,7 +3391,9 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	}
 
 	/* Commit this transaction to make the updates visible. */
+	UnregisterSnapshot(snap);
 	CommitTransactionCommand();
+
 	StartTransactionCommand();
 
 	/*
@@ -3400,6 +3407,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	WaitForLockersMultiple(lockTags, AccessExclusiveLock, true);
 
 	PushActiveSnapshot(GetTransactionSnapshot());
+	snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
 
 	{
 		ObjectAddresses *objects = new_object_addresses();
@@ -3425,6 +3433,7 @@ ReindexRelationConcurrently(Oid relationOid, int options)
 	}
 
 	PopActiveSnapshot();
+	UnregisterSnapshot(snap);
 	CommitTransactionCommand();
 
 	/*
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 6162fb018c7..e1eacc6a4a6 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15200,6 +15200,7 @@ PreCommit_on_commit_actions(void)
 	ListCell   *l;
 	List	   *oids_to_truncate = NIL;
 	List	   *oids_to_drop = NIL;
+	Snapshot	snap;
 
 	foreach(l, on_commits)
 	{
@@ -15231,6 +15232,11 @@ PreCommit_on_commit_actions(void)
 		}
 	}
 
+	if (oids_to_truncate == NIL && oids_to_drop == NIL)
+		return;
+
+	snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
 	/*
 	 * Truncate relations before dropping so that all dependencies between
 	 * relations are removed after they are worked on.  Doing it like this
@@ -15284,6 +15290,8 @@ PreCommit_on_commit_actions(void)
 		}
 #endif
 	}
+
+	UnregisterSnapshot(snap);
 }
 
 /*
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index c27d9705895..aec5a044790 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -863,6 +863,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 			{
 				Relation	rel;
 				WalRcvExecResult *res;
+				Snapshot	snap;
 
 				SpinLockAcquire(&MyLogicalRepWorker->relmutex);
 				MyLogicalRepWorker->relstate = SUBREL_STATE_DATASYNC;
@@ -871,10 +872,14 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 
 				/* Update the state and make it visible to others. */
 				StartTransactionCommand();
+				snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
 				UpdateSubscriptionRelState(MyLogicalRepWorker->subid,
 										   MyLogicalRepWorker->relid,
 										   MyLogicalRepWorker->relstate,
 										   MyLogicalRepWorker->relstate_lsn);
+
+				UnregisterSnapshot(snap);
 				CommitTransactionCommand();
 				pgstat_report_stat(false);
 
@@ -918,6 +923,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 								   CRS_USE_SNAPSHOT, origin_startpos);
 
 				PushActiveSnapshot(GetTransactionSnapshot());
+				snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
 				copy_table(rel);
 				PopActiveSnapshot();
 
@@ -933,6 +939,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 				/* Make the copy visible. */
 				CommandCounterIncrement();
 
+				UnregisterSnapshot(snap);
+
 				/*
 				 * We are done with the initial data synchronization, update
 				 * the state.
@@ -957,6 +965,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 				 */
 				if (*origin_startpos >= MyLogicalRepWorker->relstate_lsn)
 				{
+					snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid));
+
 					/*
 					 * Update the new state in catalog.  No need to bother
 					 * with the shmem state as we are exiting for good.
@@ -965,6 +975,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
 											   MyLogicalRepWorker->relid,
 											   SUBREL_STATE_SYNCDONE,
 											   *origin_startpos);
+					UnregisterSnapshot(snap);
+
 					finish_sync_worker();
 				}
 				break;
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index a752a1224d6..f10f3f843d1 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -1245,6 +1245,9 @@ apply_handle_truncate(StringInfo s)
 
 	ensure_transaction();
 
+	/* catalog modifications need a set snapshot */
+	PushActiveSnapshot(GetTransactionSnapshot());
+
 	remote_relids = logicalrep_read_truncate(s, &cascade, &restart_seqs);
 
 	foreach(lc, remote_relids)
@@ -1332,6 +1335,7 @@ apply_handle_truncate(StringInfo s)
 	}
 
 	CommandCounterIncrement();
+	PopActiveSnapshot();
 }
 
 
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 1c063c592ce..b5cff157bf6 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -441,6 +441,8 @@ GetOldestSnapshot(void)
 Snapshot
 GetCatalogSnapshot(Oid relid)
 {
+	Assert(IsTransactionState());
+
 	/*
 	 * Return historic snapshot while we're doing logical decoding, so we can
 	 * see the appropriate state of the catalog.
@@ -1017,6 +1019,8 @@ SnapshotResetXmin(void)
 	if (pairingheap_is_empty(&RegisteredSnapshots))
 	{
 		MyPgXact->xmin = InvalidTransactionId;
+		TransactionXmin = InvalidTransactionId;
+		RecentXmin = InvalidTransactionId;
 		return;
 	}
 
-- 
2.25.0.114.g5b0ca878e0

From 076c589dff7e08f0a6b562b185f179da4fbfc13a Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 6 Apr 2020 21:28:55 -0700
Subject: [PATCH 2/2] Improve and extend asserts for a snapshot being set.

---
 src/include/utils/snapmgr.h        |  2 ++
 src/backend/access/heap/heapam.c   |  6 ++++--
 src/backend/access/index/indexam.c |  8 +++++++-
 src/backend/catalog/indexing.c     | 11 +++++++++++
 src/backend/utils/time/snapmgr.c   | 19 +++++++++++++++++++
 contrib/amcheck/verify_nbtree.c    |  6 +++---
 6 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index b28d13ce841..7738d6a8e01 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -116,6 +116,8 @@ extern void PopActiveSnapshot(void);
 extern Snapshot GetActiveSnapshot(void);
 extern bool ActiveSnapshotSet(void);
 
+extern bool SnapshotSet(void);
+
 extern Snapshot RegisterSnapshot(Snapshot snapshot);
 extern void UnregisterSnapshot(Snapshot snapshot);
 extern Snapshot RegisterSnapshotOnOwner(Snapshot snapshot, ResourceOwner owner);
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c4a5aa616a3..0af51880ccc 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1136,6 +1136,8 @@ heap_beginscan(Relation relation, Snapshot snapshot,
 {
 	HeapScanDesc scan;
 
+	Assert(SnapshotSet());
+
 	/*
 	 * increment relation ref count while scanning relation
 	 *
@@ -1545,7 +1547,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 	at_chain_start = first_call;
 	skip = !first_call;
 
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	Assert(SnapshotSet());
 	Assert(BufferGetBlockNumber(buffer) == blkno);
 
 	/* Scan through possible multiple members of HOT-chain */
@@ -5633,7 +5635,7 @@ heap_abort_speculative(Relation relation, ItemPointer tid)
 	 * if so (vacuum can't subsequently move relfrozenxid to beyond
 	 * TransactionXmin, so there's no race here).
 	 */
-	Assert(TransactionIdIsValid(TransactionXmin));
+	Assert(SnapshotSet() && TransactionIdIsValid(TransactionXmin));
 	if (TransactionIdPrecedes(TransactionXmin, relation->rd_rel->relfrozenxid))
 		prune_xid = relation->rd_rel->relfrozenxid;
 	else
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index a3f77169a79..5d6354dedf5 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -184,6 +184,8 @@ index_insert(Relation indexRelation,
 	RELATION_CHECKS;
 	CHECK_REL_PROCEDURE(aminsert);
 
+	Assert(SnapshotSet());
+
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
 									   (ItemPointer) NULL,
@@ -256,6 +258,8 @@ index_beginscan_internal(Relation indexRelation,
 {
 	IndexScanDesc scan;
 
+	Assert(SnapshotSet());
+
 	RELATION_CHECKS;
 	CHECK_REL_PROCEDURE(ambeginscan);
 
@@ -519,7 +523,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction)
 	SCAN_CHECKS;
 	CHECK_SCAN_PROCEDURE(amgettuple);
 
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	Assert(SnapshotSet());
 
 	/*
 	 * The AM's amgettuple proc finds the next index entry matching the scan
@@ -574,6 +578,8 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot)
 	bool		all_dead = false;
 	bool		found;
 
+	Assert(SnapshotSet());
+
 	found = table_index_fetch_tuple(scan->xs_heapfetch, &scan->xs_heaptid,
 									scan->xs_snapshot, slot,
 									&scan->xs_heap_continue, &all_dead);
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index d63fcf58cf1..8ba6b3dfa5e 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -22,6 +22,7 @@
 #include "catalog/indexing.h"
 #include "executor/executor.h"
 #include "utils/rel.h"
+#include "utils/snapmgr.h"
 
 
 /*
@@ -184,6 +185,8 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup)
 {
 	CatalogIndexState indstate;
 
+	Assert(SnapshotSet());
+
 	indstate = CatalogOpenIndexes(heapRel);
 
 	simple_heap_insert(heapRel, tup);
@@ -204,6 +207,8 @@ void
 CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup,
 						   CatalogIndexState indstate)
 {
+	Assert(SnapshotSet());
+
 	simple_heap_insert(heapRel, tup);
 
 	CatalogIndexInsert(indstate, tup);
@@ -225,6 +230,8 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup)
 {
 	CatalogIndexState indstate;
 
+	Assert(SnapshotSet());
+
 	indstate = CatalogOpenIndexes(heapRel);
 
 	simple_heap_update(heapRel, otid, tup);
@@ -245,6 +252,8 @@ void
 CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
 						   CatalogIndexState indstate)
 {
+	Assert(SnapshotSet());
+
 	simple_heap_update(heapRel, otid, tup);
 
 	CatalogIndexInsert(indstate, tup);
@@ -268,5 +277,7 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup,
 void
 CatalogTupleDelete(Relation heapRel, ItemPointer tid)
 {
+	Assert(SnapshotSet());
+
 	simple_heap_delete(heapRel, tid);
 }
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index b5cff157bf6..3b148ae30a6 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -857,6 +857,25 @@ ActiveSnapshotSet(void)
 	return ActiveSnapshot != NULL;
 }
 
+/*
+ * Does this transaction have a snapshot.
+ */
+bool
+SnapshotSet(void)
+{
+	/* can't be safe, because somehow xmin is not set */
+	if (!TransactionIdIsValid(MyPgXact->xmin) && HistoricSnapshot == NULL)
+		return false;
+
+	/*
+	 * Can't be safe because no snapshot being active/registered means that
+	 * e.g. invalidation processing could change xmin horizon.
+	 */
+	return ActiveSnapshot != NULL ||
+		!pairingheap_is_empty(&RegisteredSnapshots) ||
+		HistoricSnapshot != NULL;
+}
+
 /*
  * RegisterSnapshot
  *		Register a snapshot as being in use by the current resource owner
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index ceaaa271680..8f43f3e9dfb 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -412,10 +412,10 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
 	Snapshot	snapshot = SnapshotAny;
 
 	/*
-	 * RecentGlobalXmin assertion matches index_getnext_tid().  See note on
-	 * RecentGlobalXmin/B-Tree page deletion.
+	 * This assertion matches the one in index_getnext_tid().  See page
+	 * recycling/RecentGlobalXmin notes in nbtree README.
 	 */
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	Assert(SnapshotSet());
 
 	/*
 	 * Initialize state for entire verification operation
-- 
2.25.0.114.g5b0ca878e0
