Hi, Robert, Tom, it'd be great if you could look through this thread. I think there's a problem here (and it has gotten worse after the introduction of catalog snapshots). Both of you at least dabbled in related code.
On 2020-02-29 12:17:07 -0800, Andres Freund wrote: > On 2020-02-28 22:10:52 -0800, Andres Freund wrote: > > So, um. What happens is that doDeletion() does a catalog scan, which > > sets a snapshot. The results of that catalog scan are then used to > > perform modifications. But at that point there's no guarantee that we > > still hold *any* snapshot, as e.g. invalidations can trigger the catalog > > snapshot being released. > > > > I don't see how that's safe. Without ->xmin preventing that, > > intermediate row versions that we did look up could just get vacuumed > > away, and replaced with a different row. That does seem like a serious > > issue? > > > > I think there's likely a lot of places that can hit this? What makes it > > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release > > ->xmin when there previously has been *any* catalog access? Because in > > contrast to normal table modifications, there's currently nothing at all > > forcing us to hold a snapshot between catalog lookups an their > > modifications? > > > > Am I missing something? Or is this a fairly significant hole in our > > arrangements? > > I still think that's true. In a first iteration I hacked around the > problem by explicitly registering a catalog snapshot in > RemoveTempRelations(). That *sometimes* allows to get through the > regression tests without the assertions triggering. The attached two patches (they're not meant to be applied) reliably get through the regression tests. But I suspect I'd have to at least do a CLOBBER_CACHE_ALWAYS run to find all the actually vulnerable places. > But I don't think that's good enough (even if we fixed the other > potential crashes similarly). The only reason that avoids the asserts is > because in nearly all other cases there's also a user snapshot that's > pushed. But that pushed snapshot can have an xmin that's newer than the > catalog snapshot, which means we're still in danger of tids from catalog > scans being outdated. > > My preliminary conclusion is that it's simply not safe to do > SnapshotResetXmin() from within InvalidateCatalogSnapshot(), > PopActiveSnapshot(), UnregisterSnapshotFromOwner() etc. Instead we need > to defer the SnapshotResetXmin() call until at least > CommitTransactionCommand()? Outside of that there ought (with exception > of multi-transaction commands, but they have to be careful anyway) to be > no "in progress" sequences of related catalog lookups/modifications. > > Alternatively we could ensure that all catalog lookup/mod sequences > ensure that the first catalog snapshot is registered. But that seems > like a gargantuan task? I also just noticed comments of this style in a few places * Start a transaction so we can access pg_database, and get a snapshot. * We don't have a use for the snapshot itself, but we're interested in * the secondary effect that it sets RecentGlobalXmin. (This is critical * for anything that reads heap pages, because HOT may decide to prune * them even if the process doesn't attempt to modify any tuples.) followed by code like StartTransactionCommand(); (void) GetTransactionSnapshot(); rel = table_open(DatabaseRelationId, AccessShareLock); scan = table_beginscan_catalog(rel, 0, NULL); which is not safe at all, unfortunately. The snapshot is not pushed/active, therefore invalidations processed e.g. as part of the table_open() could execute a InvalidateCatalogSnapshot(), which in turn would remove the catalog snapshot from the pairing heap and SnapshotResetXmin(). And poof, the backend's xmin is gone. Greetings, Andres Freund
>From 3b58990c088936122f38d855a5a3900602deacf7 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Mon, 6 Apr 2020 21:28:55 -0700 Subject: [PATCH 1/2] TMP: work around missing snapshot registrations. This is just what's hit by the tests. It's not an actual fix. --- src/backend/catalog/namespace.c | 7 +++++++ src/backend/catalog/pg_subscription.c | 4 ++++ src/backend/commands/indexcmds.c | 9 +++++++++ src/backend/commands/tablecmds.c | 8 ++++++++ src/backend/replication/logical/tablesync.c | 12 ++++++++++++ src/backend/replication/logical/worker.c | 4 ++++ src/backend/utils/time/snapmgr.c | 4 ++++ 7 files changed, 48 insertions(+) diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index 2ec23016fe5..e4696d8d417 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -55,6 +55,7 @@ #include "utils/inval.h" #include "utils/lsyscache.h" #include "utils/memutils.h" +#include "utils/snapmgr.h" #include "utils/syscache.h" #include "utils/varlena.h" @@ -4244,12 +4245,18 @@ RemoveTempRelationsCallback(int code, Datum arg) { if (OidIsValid(myTempNamespace)) /* should always be true */ { + Snapshot snap; + /* Need to ensure we have a usable transaction. */ AbortOutOfAnyTransaction(); StartTransactionCommand(); + /* ensure xmin stays set */ + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + RemoveTempRelations(myTempNamespace); + UnregisterSnapshot(snap); CommitTransactionCommand(); } } diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c index cb157311154..4a324dfb4f1 100644 --- a/src/backend/catalog/pg_subscription.c +++ b/src/backend/catalog/pg_subscription.c @@ -31,6 +31,7 @@ #include "utils/fmgroids.h" #include "utils/pg_lsn.h" #include "utils/rel.h" +#include "utils/snapmgr.h" #include "utils/syscache.h" static List *textarray_to_stringlist(ArrayType *textarray); @@ -286,6 +287,7 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state, bool nulls[Natts_pg_subscription_rel]; Datum values[Natts_pg_subscription_rel]; bool replaces[Natts_pg_subscription_rel]; + Snapshot snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock); @@ -321,6 +323,8 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state, /* Cleanup. */ table_close(rel, NoLock); + + UnregisterSnapshot(snap); } /* diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 2baca12c5f4..094bf6139f0 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -2837,6 +2837,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) char *relationName = NULL; char *relationNamespace = NULL; PGRUsage ru0; + Snapshot snap; /* * Create a memory context that will survive forced transaction commits we @@ -3306,6 +3307,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) */ StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); forboth(lc, indexIds, lc2, newIndexIds) { @@ -3354,8 +3356,11 @@ ReindexRelationConcurrently(Oid relationOid, int options) } /* Commit this transaction and make index swaps visible */ + UnregisterSnapshot(snap); CommitTransactionCommand(); + StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); /* * Phase 5 of REINDEX CONCURRENTLY @@ -3386,7 +3391,9 @@ ReindexRelationConcurrently(Oid relationOid, int options) } /* Commit this transaction to make the updates visible. */ + UnregisterSnapshot(snap); CommitTransactionCommand(); + StartTransactionCommand(); /* @@ -3400,6 +3407,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) WaitForLockersMultiple(lockTags, AccessExclusiveLock, true); PushActiveSnapshot(GetTransactionSnapshot()); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); { ObjectAddresses *objects = new_object_addresses(); @@ -3425,6 +3433,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) } PopActiveSnapshot(); + UnregisterSnapshot(snap); CommitTransactionCommand(); /* diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 6162fb018c7..e1eacc6a4a6 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -15200,6 +15200,7 @@ PreCommit_on_commit_actions(void) ListCell *l; List *oids_to_truncate = NIL; List *oids_to_drop = NIL; + Snapshot snap; foreach(l, on_commits) { @@ -15231,6 +15232,11 @@ PreCommit_on_commit_actions(void) } } + if (oids_to_truncate == NIL && oids_to_drop == NIL) + return; + + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + /* * Truncate relations before dropping so that all dependencies between * relations are removed after they are worked on. Doing it like this @@ -15284,6 +15290,8 @@ PreCommit_on_commit_actions(void) } #endif } + + UnregisterSnapshot(snap); } /* diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index c27d9705895..aec5a044790 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -863,6 +863,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) { Relation rel; WalRcvExecResult *res; + Snapshot snap; SpinLockAcquire(&MyLogicalRepWorker->relmutex); MyLogicalRepWorker->relstate = SUBREL_STATE_DATASYNC; @@ -871,10 +872,14 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) /* Update the state and make it visible to others. */ StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + UpdateSubscriptionRelState(MyLogicalRepWorker->subid, MyLogicalRepWorker->relid, MyLogicalRepWorker->relstate, MyLogicalRepWorker->relstate_lsn); + + UnregisterSnapshot(snap); CommitTransactionCommand(); pgstat_report_stat(false); @@ -918,6 +923,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) CRS_USE_SNAPSHOT, origin_startpos); PushActiveSnapshot(GetTransactionSnapshot()); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); copy_table(rel); PopActiveSnapshot(); @@ -933,6 +939,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) /* Make the copy visible. */ CommandCounterIncrement(); + UnregisterSnapshot(snap); + /* * We are done with the initial data synchronization, update * the state. @@ -957,6 +965,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) */ if (*origin_startpos >= MyLogicalRepWorker->relstate_lsn) { + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + /* * Update the new state in catalog. No need to bother * with the shmem state as we are exiting for good. @@ -965,6 +975,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) MyLogicalRepWorker->relid, SUBREL_STATE_SYNCDONE, *origin_startpos); + UnregisterSnapshot(snap); + finish_sync_worker(); } break; diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index a752a1224d6..f10f3f843d1 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -1245,6 +1245,9 @@ apply_handle_truncate(StringInfo s) ensure_transaction(); + /* catalog modifications need a set snapshot */ + PushActiveSnapshot(GetTransactionSnapshot()); + remote_relids = logicalrep_read_truncate(s, &cascade, &restart_seqs); foreach(lc, remote_relids) @@ -1332,6 +1335,7 @@ apply_handle_truncate(StringInfo s) } CommandCounterIncrement(); + PopActiveSnapshot(); } diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index 1c063c592ce..b5cff157bf6 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -441,6 +441,8 @@ GetOldestSnapshot(void) Snapshot GetCatalogSnapshot(Oid relid) { + Assert(IsTransactionState()); + /* * Return historic snapshot while we're doing logical decoding, so we can * see the appropriate state of the catalog. @@ -1017,6 +1019,8 @@ SnapshotResetXmin(void) if (pairingheap_is_empty(&RegisteredSnapshots)) { MyPgXact->xmin = InvalidTransactionId; + TransactionXmin = InvalidTransactionId; + RecentXmin = InvalidTransactionId; return; } -- 2.25.0.114.g5b0ca878e0
>From 076c589dff7e08f0a6b562b185f179da4fbfc13a Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Mon, 6 Apr 2020 21:28:55 -0700 Subject: [PATCH 2/2] Improve and extend asserts for a snapshot being set. --- src/include/utils/snapmgr.h | 2 ++ src/backend/access/heap/heapam.c | 6 ++++-- src/backend/access/index/indexam.c | 8 +++++++- src/backend/catalog/indexing.c | 11 +++++++++++ src/backend/utils/time/snapmgr.c | 19 +++++++++++++++++++ contrib/amcheck/verify_nbtree.c | 6 +++--- 6 files changed, 46 insertions(+), 6 deletions(-) diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h index b28d13ce841..7738d6a8e01 100644 --- a/src/include/utils/snapmgr.h +++ b/src/include/utils/snapmgr.h @@ -116,6 +116,8 @@ extern void PopActiveSnapshot(void); extern Snapshot GetActiveSnapshot(void); extern bool ActiveSnapshotSet(void); +extern bool SnapshotSet(void); + extern Snapshot RegisterSnapshot(Snapshot snapshot); extern void UnregisterSnapshot(Snapshot snapshot); extern Snapshot RegisterSnapshotOnOwner(Snapshot snapshot, ResourceOwner owner); diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index c4a5aa616a3..0af51880ccc 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -1136,6 +1136,8 @@ heap_beginscan(Relation relation, Snapshot snapshot, { HeapScanDesc scan; + Assert(SnapshotSet()); + /* * increment relation ref count while scanning relation * @@ -1545,7 +1547,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer, at_chain_start = first_call; skip = !first_call; - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); Assert(BufferGetBlockNumber(buffer) == blkno); /* Scan through possible multiple members of HOT-chain */ @@ -5633,7 +5635,7 @@ heap_abort_speculative(Relation relation, ItemPointer tid) * if so (vacuum can't subsequently move relfrozenxid to beyond * TransactionXmin, so there's no race here). */ - Assert(TransactionIdIsValid(TransactionXmin)); + Assert(SnapshotSet() && TransactionIdIsValid(TransactionXmin)); if (TransactionIdPrecedes(TransactionXmin, relation->rd_rel->relfrozenxid)) prune_xid = relation->rd_rel->relfrozenxid; else diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c index a3f77169a79..5d6354dedf5 100644 --- a/src/backend/access/index/indexam.c +++ b/src/backend/access/index/indexam.c @@ -184,6 +184,8 @@ index_insert(Relation indexRelation, RELATION_CHECKS; CHECK_REL_PROCEDURE(aminsert); + Assert(SnapshotSet()); + if (!(indexRelation->rd_indam->ampredlocks)) CheckForSerializableConflictIn(indexRelation, (ItemPointer) NULL, @@ -256,6 +258,8 @@ index_beginscan_internal(Relation indexRelation, { IndexScanDesc scan; + Assert(SnapshotSet()); + RELATION_CHECKS; CHECK_REL_PROCEDURE(ambeginscan); @@ -519,7 +523,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction) SCAN_CHECKS; CHECK_SCAN_PROCEDURE(amgettuple); - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); /* * The AM's amgettuple proc finds the next index entry matching the scan @@ -574,6 +578,8 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot) bool all_dead = false; bool found; + Assert(SnapshotSet()); + found = table_index_fetch_tuple(scan->xs_heapfetch, &scan->xs_heaptid, scan->xs_snapshot, slot, &scan->xs_heap_continue, &all_dead); diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index d63fcf58cf1..8ba6b3dfa5e 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -22,6 +22,7 @@ #include "catalog/indexing.h" #include "executor/executor.h" #include "utils/rel.h" +#include "utils/snapmgr.h" /* @@ -184,6 +185,8 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup) { CatalogIndexState indstate; + Assert(SnapshotSet()); + indstate = CatalogOpenIndexes(heapRel); simple_heap_insert(heapRel, tup); @@ -204,6 +207,8 @@ void CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate) { + Assert(SnapshotSet()); + simple_heap_insert(heapRel, tup); CatalogIndexInsert(indstate, tup); @@ -225,6 +230,8 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup) { CatalogIndexState indstate; + Assert(SnapshotSet()); + indstate = CatalogOpenIndexes(heapRel); simple_heap_update(heapRel, otid, tup); @@ -245,6 +252,8 @@ void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, CatalogIndexState indstate) { + Assert(SnapshotSet()); + simple_heap_update(heapRel, otid, tup); CatalogIndexInsert(indstate, tup); @@ -268,5 +277,7 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, void CatalogTupleDelete(Relation heapRel, ItemPointer tid) { + Assert(SnapshotSet()); + simple_heap_delete(heapRel, tid); } diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index b5cff157bf6..3b148ae30a6 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -857,6 +857,25 @@ ActiveSnapshotSet(void) return ActiveSnapshot != NULL; } +/* + * Does this transaction have a snapshot. + */ +bool +SnapshotSet(void) +{ + /* can't be safe, because somehow xmin is not set */ + if (!TransactionIdIsValid(MyPgXact->xmin) && HistoricSnapshot == NULL) + return false; + + /* + * Can't be safe because no snapshot being active/registered means that + * e.g. invalidation processing could change xmin horizon. + */ + return ActiveSnapshot != NULL || + !pairingheap_is_empty(&RegisteredSnapshots) || + HistoricSnapshot != NULL; +} + /* * RegisterSnapshot * Register a snapshot as being in use by the current resource owner diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index ceaaa271680..8f43f3e9dfb 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -412,10 +412,10 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, Snapshot snapshot = SnapshotAny; /* - * RecentGlobalXmin assertion matches index_getnext_tid(). See note on - * RecentGlobalXmin/B-Tree page deletion. + * This assertion matches the one in index_getnext_tid(). See page + * recycling/RecentGlobalXmin notes in nbtree README. */ - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); /* * Initialize state for entire verification operation -- 2.25.0.114.g5b0ca878e0