On Thu, May 08, 2025 at 08:55:08AM +0900, Michael Paquier wrote: > On Wed, May 07, 2025 at 02:55:49PM -0500, Nathan Bossart wrote: >> Committed with that change. That takes care of a good chunk of these TOAST >> snapshot problems. I think there are about 4 others left, unless something >> has changed since I last checked. I hope to look into those again soon. > > Thanks, for both things.
Here is what I have staged for commit for the others. I'm hoping to push these in the next couple of days. -- nathan
>From ac77516715c7d05e94f585a08253e43c64a91382 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/access/heap/heapam.c | 29 ++++++++++++++++++++++++ src/backend/commands/indexcmds.c | 2 +- src/backend/commands/tablecmds.c | 8 +++++++ src/backend/postmaster/autovacuum.c | 7 ++++++ src/backend/replication/logical/worker.c | 24 ++++++++++++++++++++ 5 files changed, 69 insertions(+), 1 deletion(-) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 9ec8cda1c68..2be7f817c78 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -213,6 +213,27 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] = #define TUPLOCK_from_mxstatus(status) \ (MultiXactStatusLock[(status)]) +/* + * Check that we have a valid snapshot if we might need TOAST access. + */ +static inline void +AssertHasSnapshotForToast(Relation rel) +{ +#ifdef USE_ASSERT_CHECKING + + /* bootstrap mode in particular breaks this rule */ + if (!IsNormalProcessingMode()) + return; + + /* if the relation doesn't have a TOAST table, we are good */ + if (!OidIsValid(rel->rd_rel->reltoastrelid)) + return; + + Assert(HaveRegisteredOrActiveSnapshot()); + +#endif /* USE_ASSERT_CHECKING */ +} + /* ---------------------------------------------------------------- * heap support routines * ---------------------------------------------------------------- @@ -2066,6 +2087,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, Assert(HeapTupleHeaderGetNatts(tup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Fill in tuple header fields and toast the tuple if necessary. * @@ -2343,6 +2366,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, /* currently not needed (thus unsupported) for heap_multi_insert() */ Assert(!(options & HEAP_INSERT_NO_LOGICAL)); + AssertHasSnapshotForToast(relation); + needwal = RelationNeedsWAL(relation); saveFreeSpace = RelationGetTargetPageFreeSpace(relation, HEAP_DEFAULT_FILLFACTOR); @@ -2765,6 +2790,8 @@ heap_delete(Relation relation, ItemPointer tid, Assert(ItemPointerIsValid(tid)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we @@ -3260,6 +3287,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, Assert(HeapTupleHeaderGetNatts(newtup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index d962fe392cd..c3ec2076a52 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -4226,7 +4226,7 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein false); /* - * Updating pg_index might involve TOAST table access, so ensure we + * Swapping the indexes might involve TOAST table access, so ensure we * have a valid snapshot. */ PushActiveSnapshot(GetTransactionSnapshot()); diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 54ad38247aa..acf11e83c04 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -20964,9 +20964,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* + * Detaching the partition might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 981be42e3af..451fb90a610 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2234,6 +2234,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2246,6 +2252,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 4151a4b2a96..a23262957ac 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -4626,8 +4626,16 @@ run_apply_worker() walrcv_startstreaming(LogRepWorkerWalRcvConn, &options); StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so + * ensure we have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED); MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED; + PopActiveSnapshot(); CommitTransactionCommand(); } else @@ -4843,7 +4851,15 @@ DisableSubscriptionAndExit(void) /* Disable the subscription */ StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + DisableSubscription(MySubscription->oid); + PopActiveSnapshot(); CommitTransactionCommand(); /* Ensure we remove no-longer-useful entry for worker's start time */ @@ -4947,6 +4963,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) started_tx = true; } + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Protect subskiplsn of pg_subscription from being concurrently updated * while clearing it. @@ -5005,6 +5027,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) heap_freetuple(tup); table_close(rel, NoLock); + PopActiveSnapshot(); + if (started_tx) CommitTransactionCommand(); } -- 2.39.5 (Apple Git-154)
>From 04177a63cffeecb0004f84575df5935f2d1ab3ee Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/commands/indexcmds.c | 8 ++++++++ src/backend/postmaster/autovacuum.c | 7 +++++++ 2 files changed, 15 insertions(+) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index e0295d7b8ae..f7c09a63242 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -3519,12 +3519,20 @@ ReindexRelationConcurrently(Oid relationOid, int options) get_rel_namespace(heapId), false); + /* + * Swapping the indexes might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Swap old index with the new one. This also marks the new one as * valid and the old one as not valid. */ index_concurrently_swap(newIndexId, oldIndexId, oldName); + PopActiveSnapshot(); + /* * Invalidate the relcache for the table, so that after this commit * all sessions will refresh any cached plans that might reference the diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 59fcc8b93f0..23cca675f00 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2294,6 +2294,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2306,6 +2312,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); -- 2.39.5 (Apple Git-154)
>From 56dd3903c2503a4ae5727107489b2c5082e62ef4 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/commands/indexcmds.c | 8 ++++++++ src/backend/commands/tablecmds.c | 8 ++++++++ src/backend/postmaster/autovacuum.c | 7 +++++++ 3 files changed, 23 insertions(+) diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index c2904cdb1a7..b658642ee93 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -4005,12 +4005,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params) get_rel_namespace(oldidx->tableId), false); + /* + * Swapping the indexes might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Swap old index with the new one. This also marks the new one as * valid and the old one as not valid. */ index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName); + PopActiveSnapshot(); + /* * Invalidate the relcache for the table, so that after this commit * all sessions will refresh any cached plans that might reference the diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index b411b87b91c..ba3cc478016 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -18220,9 +18220,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* + * Detaching the partition might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index e70f32d0ab4..b4e529efd82 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2321,6 +2321,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2333,6 +2339,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); -- 2.39.5 (Apple Git-154)
>From 405825223e8a2d896e0f8cda5a5fe0c978a8c711 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++ src/backend/commands/indexcmds.c | 8 +++++ src/backend/commands/tablecmds.c | 8 +++++ src/backend/postmaster/autovacuum.c | 7 ++++ src/backend/replication/logical/worker.c | 24 ++++++++++++++ 5 files changed, 88 insertions(+) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 8aaddcbb30b..ba737c5b2ee 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -54,6 +54,7 @@ #include "catalog/catalog.h" #include "catalog/pg_database.h" #include "catalog/pg_database_d.h" +#include "catalog/pg_replication_origin.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -227,6 +228,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] = #define TUPLOCK_from_mxstatus(status) \ (MultiXactStatusLock[(status)]) +/* + * Check that we have a valid snapshot if we might need TOAST access. + */ +static inline void +AssertHasSnapshotForToast(Relation rel) +{ +#ifdef USE_ASSERT_CHECKING + + /* bootstrap mode in particular breaks this rule */ + if (!IsNormalProcessingMode()) + return; + + /* if the relation doesn't have a TOAST table, we are good */ + if (!OidIsValid(rel->rd_rel->reltoastrelid)) + return; + + /* + * Commit 16bf24e fixed accesses to pg_replication_origin without a + * an active snapshot by removing its TOAST table. On older branches, + * these bugs are left in place. Its only varlena column is roname (the + * replication origin name), so this is only a problem if the name + * requires out-of-line storage, which seems unlikely. In any case, + * fixing it doesn't seem worth extra code churn on the back-branches. + */ + if (RelationGetRelid(rel) == ReplicationOriginRelationId) + return; + + Assert(HaveRegisteredOrActiveSnapshot()); + +#endif /* USE_ASSERT_CHECKING */ +} + /* ---------------------------------------------------------------- * heap support routines * ---------------------------------------------------------------- @@ -2047,6 +2080,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, Assert(HeapTupleHeaderGetNatts(tup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Fill in tuple header fields and toast the tuple if necessary. * @@ -2294,6 +2329,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, /* currently not needed (thus unsupported) for heap_multi_insert() */ AssertArg(!(options & HEAP_INSERT_NO_LOGICAL)); + AssertHasSnapshotForToast(relation); + needwal = RelationNeedsWAL(relation); saveFreeSpace = RelationGetTargetPageFreeSpace(relation, HEAP_DEFAULT_FILLFACTOR); @@ -2697,6 +2734,8 @@ heap_delete(Relation relation, ItemPointer tid, Assert(ItemPointerIsValid(tid)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we @@ -3188,6 +3227,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, Assert(HeapTupleHeaderGetNatts(newtup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 77aa99496fa..1e3af0093b4 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -3993,12 +3993,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params) get_rel_namespace(oldidx->tableId), false); + /* + * Swapping the indexes might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Swap old index with the new one. This also marks the new one as * valid and the old one as not valid. */ index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName); + PopActiveSnapshot(); + /* * Invalidate the relcache for the table, so that after this commit * all sessions will refresh any cached plans that might reference the diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index ab3ab794083..4f7be0247c2 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -18810,9 +18810,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* + * Detaching the partition might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index c511a132263..94c93c6e4b5 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2315,6 +2315,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2327,6 +2333,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index dcc3fdf6c76..155bf7c44ee 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -3794,8 +3794,16 @@ ApplyWorkerMain(Datum main_arg) walrcv_startstreaming(LogRepWorkerWalRcvConn, &options); StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so + * ensure we have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED); MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED; + PopActiveSnapshot(); CommitTransactionCommand(); } else @@ -3848,7 +3856,15 @@ DisableSubscriptionAndExit(void) /* Disable the subscription */ StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + DisableSubscription(MySubscription->oid); + PopActiveSnapshot(); CommitTransactionCommand(); /* Notify the subscription has been disabled and exit */ @@ -3939,6 +3955,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) started_tx = true; } + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Protect subskiplsn of pg_subscription from being concurrently updated * while clearing it. @@ -3997,6 +4019,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) heap_freetuple(tup); table_close(rel, NoLock); + PopActiveSnapshot(); + if (started_tx) CommitTransactionCommand(); } -- 2.39.5 (Apple Git-154)
>From c06dfa22c789b177cdc6a33530b08ffd671ea84b Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++ src/backend/commands/indexcmds.c | 8 +++++ src/backend/commands/tablecmds.c | 8 +++++ src/backend/postmaster/autovacuum.c | 7 ++++ src/backend/replication/logical/worker.c | 24 ++++++++++++++ 5 files changed, 88 insertions(+) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 429dc2cebe3..6e7e07929d2 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -54,6 +54,7 @@ #include "catalog/catalog.h" #include "catalog/pg_database.h" #include "catalog/pg_database_d.h" +#include "catalog/pg_replication_origin.h" #include "commands/vacuum.h" #include "miscadmin.h" #include "pgstat.h" @@ -231,6 +232,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] = #define TUPLOCK_from_mxstatus(status) \ (MultiXactStatusLock[(status)]) +/* + * Check that we have a valid snapshot if we might need TOAST access. + */ +static inline void +AssertHasSnapshotForToast(Relation rel) +{ +#ifdef USE_ASSERT_CHECKING + + /* bootstrap mode in particular breaks this rule */ + if (!IsNormalProcessingMode()) + return; + + /* if the relation doesn't have a TOAST table, we are good */ + if (!OidIsValid(rel->rd_rel->reltoastrelid)) + return; + + /* + * Commit 16bf24e fixed accesses to pg_replication_origin without a + * an active snapshot by removing its TOAST table. On older branches, + * these bugs are left in place. Its only varlena column is roname (the + * replication origin name), so this is only a problem if the name + * requires out-of-line storage, which seems unlikely. In any case, + * fixing it doesn't seem worth extra code churn on the back-branches. + */ + if (RelationGetRelid(rel) == ReplicationOriginRelationId) + return; + + Assert(HaveRegisteredOrActiveSnapshot()); + +#endif /* USE_ASSERT_CHECKING */ +} + /* ---------------------------------------------------------------- * heap support routines * ---------------------------------------------------------------- @@ -1858,6 +1891,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, Assert(HeapTupleHeaderGetNatts(tup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Fill in tuple header fields and toast the tuple if necessary. * @@ -2135,6 +2170,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, /* currently not needed (thus unsupported) for heap_multi_insert() */ Assert(!(options & HEAP_INSERT_NO_LOGICAL)); + AssertHasSnapshotForToast(relation); + needwal = RelationNeedsWAL(relation); saveFreeSpace = RelationGetTargetPageFreeSpace(relation, HEAP_DEFAULT_FILLFACTOR); @@ -2557,6 +2594,8 @@ heap_delete(Relation relation, ItemPointer tid, Assert(ItemPointerIsValid(tid)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we @@ -3052,6 +3091,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, Assert(HeapTupleHeaderGetNatts(newtup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index ea3967b7c28..cf6ef3a955d 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -4064,12 +4064,20 @@ ReindexRelationConcurrently(Oid relationOid, ReindexParams *params) get_rel_namespace(oldidx->tableId), false); + /* + * Swapping the indexes might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Swap old index with the new one. This also marks the new one as * valid and the old one as not valid. */ index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName); + PopActiveSnapshot(); + /* * Invalidate the relcache for the table, so that after this commit * all sessions will refresh any cached plans that might reference the diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 633a9c8317c..cfb64768303 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -18789,9 +18789,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* + * Detaching the partition might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 4a5df488860..256c25af3b1 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2372,6 +2372,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2384,6 +2390,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index 363085b46a1..4df0a594dc9 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -4725,8 +4725,16 @@ ApplyWorkerMain(Datum main_arg) walrcv_startstreaming(LogRepWorkerWalRcvConn, &options); StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so + * ensure we have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED); MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED; + PopActiveSnapshot(); CommitTransactionCommand(); } else @@ -4779,7 +4787,15 @@ DisableSubscriptionAndExit(void) /* Disable the subscription */ StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + DisableSubscription(MySubscription->oid); + PopActiveSnapshot(); CommitTransactionCommand(); /* Ensure we remove no-longer-useful entry for worker's start time */ @@ -4883,6 +4899,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) started_tx = true; } + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Protect subskiplsn of pg_subscription from being concurrently updated * while clearing it. @@ -4941,6 +4963,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) heap_freetuple(tup); table_close(rel, NoLock); + PopActiveSnapshot(); + if (started_tx) CommitTransactionCommand(); } -- 2.39.5 (Apple Git-154)
>From f7bff0b37a00c34167898f31dd2ef0405b8af1da Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nat...@postgresql.org> Date: Tue, 27 May 2025 14:58:37 -0500 Subject: [PATCH v6 1/1] Ensure we have a snapshot when updating various system catalogs. A few places that access system catalogs don't set up an active snapshot before potentially accessing their TOAST tables. To fix, push an active snapshot just before each section of code that might require accessing one of these TOAST tables, and pop it shortly afterwards. While at it, this commit adds some rather strict assertions in an attempt to prevent such issues in the future. Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in order to fix the same problem for that catalog. On the back-branches, those bugs are left in place. We cannot easily remove a catalog's TOAST table on released major versions, and only replication origins with extremely long names are affected. Given the low severity of the issue, it didn't seem worth the trouble of significantly modifying the patch on the back-branches. Also, on v13 and v14, the aforementioned strict assertions have been omitted because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was not backpatched. While we could probably back-patch it now, I've opted against it because it seems unlikely that new TOAST snapshot issues will be introduced in the oldest supported versions. Reported-by: Alexander Lakhin <exclus...@gmail.com> Reviewed-by: Michael Paquier <mich...@paquier.xyz> Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan Backpatch-through: 13 --- src/backend/access/heap/heapam.c | 41 ++++++++++++++++++++++++ src/backend/commands/indexcmds.c | 8 +++++ src/backend/commands/tablecmds.c | 8 +++++ src/backend/postmaster/autovacuum.c | 7 ++++ src/backend/replication/logical/worker.c | 24 ++++++++++++++ 5 files changed, 88 insertions(+) diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 95e3be524a7..ccf151f548b 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -53,6 +53,7 @@ #include "catalog/catalog.h" #include "catalog/pg_database.h" #include "catalog/pg_database_d.h" +#include "catalog/pg_replication_origin.h" #include "commands/vacuum.h" #include "miscadmin.h" #include "pgstat.h" @@ -230,6 +231,38 @@ static const int MultiXactStatusLock[MaxMultiXactStatus + 1] = #define TUPLOCK_from_mxstatus(status) \ (MultiXactStatusLock[(status)]) +/* + * Check that we have a valid snapshot if we might need TOAST access. + */ +static inline void +AssertHasSnapshotForToast(Relation rel) +{ +#ifdef USE_ASSERT_CHECKING + + /* bootstrap mode in particular breaks this rule */ + if (!IsNormalProcessingMode()) + return; + + /* if the relation doesn't have a TOAST table, we are good */ + if (!OidIsValid(rel->rd_rel->reltoastrelid)) + return; + + /* + * Commit 16bf24e fixed accesses to pg_replication_origin without a + * an active snapshot by removing its TOAST table. On older branches, + * these bugs are left in place. Its only varlena column is roname (the + * replication origin name), so this is only a problem if the name + * requires out-of-line storage, which seems unlikely. In any case, + * fixing it doesn't seem worth extra code churn on the back-branches. + */ + if (RelationGetRelid(rel) == ReplicationOriginRelationId) + return; + + Assert(HaveRegisteredOrActiveSnapshot()); + +#endif /* USE_ASSERT_CHECKING */ +} + /* ---------------------------------------------------------------- * heap support routines * ---------------------------------------------------------------- @@ -2015,6 +2048,8 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid, Assert(HeapTupleHeaderGetNatts(tup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Fill in tuple header fields and toast the tuple if necessary. * @@ -2292,6 +2327,8 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples, /* currently not needed (thus unsupported) for heap_multi_insert() */ Assert(!(options & HEAP_INSERT_NO_LOGICAL)); + AssertHasSnapshotForToast(relation); + needwal = RelationNeedsWAL(relation); saveFreeSpace = RelationGetTargetPageFreeSpace(relation, HEAP_DEFAULT_FILLFACTOR); @@ -2714,6 +2751,8 @@ heap_delete(Relation relation, ItemPointer tid, Assert(ItemPointerIsValid(tid)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we @@ -3209,6 +3248,8 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, Assert(HeapTupleHeaderGetNatts(newtup->t_data) <= RelationGetNumberOfAttributes(relation)); + AssertHasSnapshotForToast(relation); + /* * Forbid this during a parallel operation, lest it allocate a combo CID. * Other workers might need that combo CID for visibility checks, and we diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index f7fe1fae91c..08a056bc0f8 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -4094,12 +4094,20 @@ ReindexRelationConcurrently(const ReindexStmt *stmt, Oid relationOid, const Rein get_rel_namespace(oldidx->tableId), false); + /* + * Swapping the indexes might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Swap old index with the new one. This also marks the new one as * valid and the old one as not valid. */ index_concurrently_swap(newidx->indexId, oldidx->indexId, oldName); + PopActiveSnapshot(); + /* * Invalidate the relcache for the table, so that after this commit * all sessions will refresh any cached plans that might reference the diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index e6606e5e57a..a4e0fd06781 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -19268,9 +19268,17 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel, tab->rel = rel; } + /* + * Detaching the partition might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* Do the final part of detaching */ DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid); + PopActiveSnapshot(); + ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel)); /* keep our lock until commit */ diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 0e30a7fda66..ec5699e48e8 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2220,6 +2220,12 @@ do_autovacuum(void) get_namespace_name(classForm->relnamespace), NameStr(classForm->relname)))); + /* + * Deletion might involve TOAST table access, so ensure we have a + * valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + object.classId = RelationRelationId; object.objectId = relid; object.objectSubId = 0; @@ -2232,6 +2238,7 @@ do_autovacuum(void) * To commit the deletion, end current transaction and start a new * one. Note this also releases the locks we took. */ + PopActiveSnapshot(); CommitTransactionCommand(); StartTransactionCommand(); diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c index db09978697f..1bff6c92dda 100644 --- a/src/backend/replication/logical/worker.c +++ b/src/backend/replication/logical/worker.c @@ -4550,8 +4550,16 @@ run_apply_worker() walrcv_startstreaming(LogRepWorkerWalRcvConn, &options); StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so + * ensure we have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + UpdateTwoPhaseState(MySubscription->oid, LOGICALREP_TWOPHASE_STATE_ENABLED); MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED; + PopActiveSnapshot(); CommitTransactionCommand(); } else @@ -4767,7 +4775,15 @@ DisableSubscriptionAndExit(void) /* Disable the subscription */ StartTransactionCommand(); + + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + DisableSubscription(MySubscription->oid); + PopActiveSnapshot(); CommitTransactionCommand(); /* Ensure we remove no-longer-useful entry for worker's start time */ @@ -4871,6 +4887,12 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) started_tx = true; } + /* + * Updating pg_subscription might involve TOAST table access, so ensure we + * have a valid snapshot. + */ + PushActiveSnapshot(GetTransactionSnapshot()); + /* * Protect subskiplsn of pg_subscription from being concurrently updated * while clearing it. @@ -4929,6 +4951,8 @@ clear_subscription_skip_lsn(XLogRecPtr finish_lsn) heap_freetuple(tup); table_close(rel, NoLock); + PopActiveSnapshot(); + if (started_tx) CommitTransactionCommand(); } -- 2.39.5 (Apple Git-154)