Hi, On 2020-03-28 12:30:40 -0700, Andres Freund wrote: > On 2020-02-28 22:10:52 -0800, Andres Freund wrote: > > On 2020-02-28 21:24:59 -0800, Andres Freund wrote: > > So, um. What happens is that doDeletion() does a catalog scan, which > > sets a snapshot. The results of that catalog scan are then used to > > perform modifications. But at that point there's no guarantee that we > > still hold *any* snapshot, as e.g. invalidations can trigger the catalog > > snapshot being released. > > > > I don't see how that's safe. Without ->xmin preventing that, > > intermediate row versions that we did look up could just get vacuumed > > away, and replaced with a different row. That does seem like a serious > > issue? > > > > I think there's likely a lot of places that can hit this? What makes it > > safe for InvalidateCatalogSnapshot()->SnapshotResetXmin() to release > > ->xmin when there previously has been *any* catalog access? Because in > > contrast to normal table modifications, there's currently nothing at all > > forcing us to hold a snapshot between catalog lookups an their > > modifications? > > > > Am I missing something? Or is this a fairly significant hole in our > > arrangements? > > > > The easiest way to fix this would probably be to have inval.c call a > > version of InvalidateCatalogSnapshot() that leaves the oldest catalog > > snapshot around, but sets up things so that GetCatalogSnapshot() will > > return a freshly taken snapshot? ISTM that pretty much every > > InvalidateCatalogSnapshot() call within a transaction needs that behaviour? > > I'd still like to get some input here.
Attached is a one patch that adds assertions to detect this, and one that puts enough workarounds in place to make the tests pass. I don't like this much, but I thought it'd be useful for others to understand the problem. Greetings, Andres Freund
>From a22780e3544d7c83b5ad8851de240c3d5ef8f221 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Sat, 29 Feb 2020 18:07:25 -0800 Subject: [PATCH v1 1/2] TMP: work around missing snapshot registrations. This is just what's hit by the tests. It's not an actual fix. --- src/backend/catalog/namespace.c | 7 +++++++ src/backend/catalog/pg_subscription.c | 4 ++++ src/backend/commands/indexcmds.c | 9 +++++++++ src/backend/commands/tablecmds.c | 8 ++++++++ src/backend/replication/logical/tablesync.c | 12 ++++++++++++ src/backend/replication/logical/worker.c | 4 ++++ src/backend/utils/time/snapmgr.c | 4 ++++ 7 files changed, 48 insertions(+) diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c index 2ec23016fe5..e4696d8d417 100644 --- a/src/backend/catalog/namespace.c +++ b/src/backend/catalog/namespace.c @@ -55,6 +55,7 @@ #include "utils/inval.h" #include "utils/lsyscache.h" #include "utils/memutils.h" +#include "utils/snapmgr.h" #include "utils/syscache.h" #include "utils/varlena.h" @@ -4244,12 +4245,18 @@ RemoveTempRelationsCallback(int code, Datum arg) { if (OidIsValid(myTempNamespace)) /* should always be true */ { + Snapshot snap; + /* Need to ensure we have a usable transaction. */ AbortOutOfAnyTransaction(); StartTransactionCommand(); + /* ensure xmin stays set */ + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + RemoveTempRelations(myTempNamespace); + UnregisterSnapshot(snap); CommitTransactionCommand(); } } diff --git a/src/backend/catalog/pg_subscription.c b/src/backend/catalog/pg_subscription.c index cb157311154..4a324dfb4f1 100644 --- a/src/backend/catalog/pg_subscription.c +++ b/src/backend/catalog/pg_subscription.c @@ -31,6 +31,7 @@ #include "utils/fmgroids.h" #include "utils/pg_lsn.h" #include "utils/rel.h" +#include "utils/snapmgr.h" #include "utils/syscache.h" static List *textarray_to_stringlist(ArrayType *textarray); @@ -286,6 +287,7 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state, bool nulls[Natts_pg_subscription_rel]; Datum values[Natts_pg_subscription_rel]; bool replaces[Natts_pg_subscription_rel]; + Snapshot snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); LockSharedObject(SubscriptionRelationId, subid, 0, AccessShareLock); @@ -321,6 +323,8 @@ UpdateSubscriptionRelState(Oid subid, Oid relid, char state, /* Cleanup. */ table_close(rel, NoLock); + + UnregisterSnapshot(snap); } /* diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 4e8263af4be..b5fe3649a22 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -2720,6 +2720,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) char *relationName = NULL; char *relationNamespace = NULL; PGRUsage ru0; + Snapshot snap; /* * Create a memory context that will survive forced transaction commits we @@ -3189,6 +3190,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) */ StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); forboth(lc, indexIds, lc2, newIndexIds) { @@ -3237,8 +3239,11 @@ ReindexRelationConcurrently(Oid relationOid, int options) } /* Commit this transaction and make index swaps visible */ + UnregisterSnapshot(snap); CommitTransactionCommand(); + StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); /* * Phase 5 of REINDEX CONCURRENTLY @@ -3269,7 +3274,9 @@ ReindexRelationConcurrently(Oid relationOid, int options) } /* Commit this transaction to make the updates visible. */ + UnregisterSnapshot(snap); CommitTransactionCommand(); + StartTransactionCommand(); /* @@ -3283,6 +3290,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) WaitForLockersMultiple(lockTags, AccessExclusiveLock, true); PushActiveSnapshot(GetTransactionSnapshot()); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); { ObjectAddresses *objects = new_object_addresses(); @@ -3308,6 +3316,7 @@ ReindexRelationConcurrently(Oid relationOid, int options) } PopActiveSnapshot(); + UnregisterSnapshot(snap); CommitTransactionCommand(); /* diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 8e35c5bd1a2..311a950297a 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -15148,6 +15148,7 @@ PreCommit_on_commit_actions(void) ListCell *l; List *oids_to_truncate = NIL; List *oids_to_drop = NIL; + Snapshot snap; foreach(l, on_commits) { @@ -15179,6 +15180,11 @@ PreCommit_on_commit_actions(void) } } + if (oids_to_truncate == NIL && oids_to_drop == NIL) + return; + + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + /* * Truncate relations before dropping so that all dependencies between * relations are removed after they are worked on. Doing it like this @@ -15232,6 +15238,8 @@ PreCommit_on_commit_actions(void) } #endif } + + UnregisterSnapshot(snap); } /* diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index a60c6661538..5bdb15b1d50 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -864,6 +864,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) { Relation rel; WalRcvExecResult *res; + Snapshot snap; SpinLockAcquire(&MyLogicalRepWorker->relmutex); MyLogicalRepWorker->relstate = SUBREL_STATE_DATASYNC; @@ -872,10 +873,14 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) /* Update the state and make it visible to others. */ StartTransactionCommand(); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + UpdateSubscriptionRelState(MyLogicalRepWorker->subid, MyLogicalRepWorker->relid, MyLogicalRepWorker->relstate, MyLogicalRepWorker->relstate_lsn); + + UnregisterSnapshot(snap); CommitTransactionCommand(); pgstat_report_stat(false); @@ -919,6 +924,7 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) CRS_USE_SNAPSHOT, origin_startpos); PushActiveSnapshot(GetTransactionSnapshot()); + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); copy_table(rel); PopActiveSnapshot(); @@ -934,6 +940,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) /* Make the copy visible. */ CommandCounterIncrement(); + UnregisterSnapshot(snap); + /* * We are done with the initial data synchronization, update * the state. @@ -958,6 +966,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) */ if (*origin_startpos >= MyLogicalRepWorker->relstate_lsn) { + snap = RegisterSnapshot(GetCatalogSnapshot(InvalidOid)); + /* * Update the new state in catalog. No need to bother * with the shmem state as we are exiting for good. @@ -966,6 +976,8 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos) MyLogicalRepWorker->relid, SUBREL_STATE_SYNCDONE, *origin_startpos); + UnregisterSnapshot(snap); + finish_sync_worker(); } break; diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index fa3811783f6..f60b1581abf 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -989,6 +989,9 @@ apply_handle_truncate(StringInfo s) ensure_transaction(); + /* catalog modifications need a set snapshot */ + PushActiveSnapshot(GetTransactionSnapshot()); + remote_relids = logicalrep_read_truncate(s, &cascade, &restart_seqs); foreach(lc, remote_relids) @@ -1029,6 +1032,7 @@ apply_handle_truncate(StringInfo s) } CommandCounterIncrement(); + PopActiveSnapshot(); } diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index 1c063c592ce..b5cff157bf6 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -441,6 +441,8 @@ GetOldestSnapshot(void) Snapshot GetCatalogSnapshot(Oid relid) { + Assert(IsTransactionState()); + /* * Return historic snapshot while we're doing logical decoding, so we can * see the appropriate state of the catalog. @@ -1017,6 +1019,8 @@ SnapshotResetXmin(void) if (pairingheap_is_empty(&RegisteredSnapshots)) { MyPgXact->xmin = InvalidTransactionId; + TransactionXmin = InvalidTransactionId; + RecentXmin = InvalidTransactionId; return; } -- 2.25.0.114.g5b0ca878e0
>From 4e213f29851ceee73f9b20846df84f6f9662714c Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Sat, 29 Feb 2020 19:33:21 -0800 Subject: [PATCH v1 2/2] Improve and extend asserts for a snapshot being set. --- src/include/utils/snapmgr.h | 2 ++ src/backend/access/heap/heapam.c | 6 ++++-- src/backend/access/index/indexam.c | 8 +++++++- src/backend/catalog/indexing.c | 11 +++++++++++ src/backend/utils/time/snapmgr.c | 19 +++++++++++++++++++ contrib/amcheck/verify_nbtree.c | 4 ++-- 6 files changed, 45 insertions(+), 5 deletions(-) diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h index b28d13ce841..7738d6a8e01 100644 --- a/src/include/utils/snapmgr.h +++ b/src/include/utils/snapmgr.h @@ -116,6 +116,8 @@ extern void PopActiveSnapshot(void); extern Snapshot GetActiveSnapshot(void); extern bool ActiveSnapshotSet(void); +extern bool SnapshotSet(void); + extern Snapshot RegisterSnapshot(Snapshot snapshot); extern void UnregisterSnapshot(Snapshot snapshot); extern Snapshot RegisterSnapshotOnOwner(Snapshot snapshot, ResourceOwner owner); diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 29694b8aa4a..912cadeb03a 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -1137,6 +1137,8 @@ heap_beginscan(Relation relation, Snapshot snapshot, { HeapScanDesc scan; + Assert(SnapshotSet()); + /* * increment relation ref count while scanning relation * @@ -1546,7 +1548,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer, at_chain_start = first_call; skip = !first_call; - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); Assert(BufferGetBlockNumber(buffer) == blkno); /* Scan through possible multiple members of HOT-chain */ @@ -5629,7 +5631,7 @@ heap_abort_speculative(Relation relation, ItemPointer tid) * RecentGlobalXmin. That's not pretty, but it doesn't seem worth * inventing a nicer API for this. */ - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); PageSetPrunable(page, RecentGlobalXmin); /* store transaction information of xact deleting the tuple */ diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c index a5210d0b342..558b490d079 100644 --- a/src/backend/access/index/indexam.c +++ b/src/backend/access/index/indexam.c @@ -178,6 +178,8 @@ index_insert(Relation indexRelation, RELATION_CHECKS; CHECK_REL_PROCEDURE(aminsert); + Assert(SnapshotSet()); + if (!(indexRelation->rd_indam->ampredlocks)) CheckForSerializableConflictIn(indexRelation, (ItemPointer) NULL, @@ -250,6 +252,8 @@ index_beginscan_internal(Relation indexRelation, { IndexScanDesc scan; + Assert(SnapshotSet()); + RELATION_CHECKS; CHECK_REL_PROCEDURE(ambeginscan); @@ -513,7 +517,7 @@ index_getnext_tid(IndexScanDesc scan, ScanDirection direction) SCAN_CHECKS; CHECK_SCAN_PROCEDURE(amgettuple); - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); /* * The AM's amgettuple proc finds the next index entry matching the scan @@ -568,6 +572,8 @@ index_fetch_heap(IndexScanDesc scan, TupleTableSlot *slot) bool all_dead = false; bool found; + Assert(SnapshotSet()); + found = table_index_fetch_tuple(scan->xs_heapfetch, &scan->xs_heaptid, scan->xs_snapshot, slot, &scan->xs_heap_continue, &all_dead); diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index d63fcf58cf1..8ba6b3dfa5e 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -22,6 +22,7 @@ #include "catalog/indexing.h" #include "executor/executor.h" #include "utils/rel.h" +#include "utils/snapmgr.h" /* @@ -184,6 +185,8 @@ CatalogTupleInsert(Relation heapRel, HeapTuple tup) { CatalogIndexState indstate; + Assert(SnapshotSet()); + indstate = CatalogOpenIndexes(heapRel); simple_heap_insert(heapRel, tup); @@ -204,6 +207,8 @@ void CatalogTupleInsertWithInfo(Relation heapRel, HeapTuple tup, CatalogIndexState indstate) { + Assert(SnapshotSet()); + simple_heap_insert(heapRel, tup); CatalogIndexInsert(indstate, tup); @@ -225,6 +230,8 @@ CatalogTupleUpdate(Relation heapRel, ItemPointer otid, HeapTuple tup) { CatalogIndexState indstate; + Assert(SnapshotSet()); + indstate = CatalogOpenIndexes(heapRel); simple_heap_update(heapRel, otid, tup); @@ -245,6 +252,8 @@ void CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, CatalogIndexState indstate) { + Assert(SnapshotSet()); + simple_heap_update(heapRel, otid, tup); CatalogIndexInsert(indstate, tup); @@ -268,5 +277,7 @@ CatalogTupleUpdateWithInfo(Relation heapRel, ItemPointer otid, HeapTuple tup, void CatalogTupleDelete(Relation heapRel, ItemPointer tid) { + Assert(SnapshotSet()); + simple_heap_delete(heapRel, tid); } diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c index b5cff157bf6..f7e1665aae6 100644 --- a/src/backend/utils/time/snapmgr.c +++ b/src/backend/utils/time/snapmgr.c @@ -857,6 +857,25 @@ ActiveSnapshotSet(void) return ActiveSnapshot != NULL; } +/* + * Does this transaction have a snapshot. + */ +bool +SnapshotSet(void) +{ + /* can't be safe, because somehow xmin is not set */ + if (!TransactionIdIsValid(MyPgXact->xmin) && HistoricSnapshot == NULL) + return false; + + /* + * Can't be safe because no snapshot being registered means invalidation + * processing can change xmin horizon. + */ + return ActiveSnapshot != NULL || + !pairingheap_is_empty(&RegisteredSnapshots) || + HistoricSnapshot != NULL; +} + /* * RegisterSnapshot * Register a snapshot as being in use by the current resource owner diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c index ceaaa271680..50a46dca933 100644 --- a/contrib/amcheck/verify_nbtree.c +++ b/contrib/amcheck/verify_nbtree.c @@ -412,10 +412,10 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace, Snapshot snapshot = SnapshotAny; /* - * RecentGlobalXmin assertion matches index_getnext_tid(). See note on * RecentGlobalXmin/B-Tree page deletion. + * This assertion matches the one in index_getnext_tid(). See note on */ - Assert(TransactionIdIsValid(RecentGlobalXmin)); + Assert(SnapshotSet()); /* * Initialize state for entire verification operation -- 2.25.0.114.g5b0ca878e0