Hi, When finishing the prepared transaction, we are marking it as "not valid" after the WAL record has been written. This allows us to be sure that even if we fail later, no one will be able to commit/rollback this transaction again. Here is the code that implements it :
``` /* * In case we fail while running the callbacks, mark the gxact invalid so * no one else will try to commit/rollback, and so it will be recycled if * we fail after this point. It is still locked by our backend so it * won't go away yet. * * (We assume it's safe to do this without taking TwoPhaseStateLock.) */ gxact->valid = false; ``` Comment says that we can fail while running the callbacks. But if this really happens, then we are in trouble : 1) Twophase transaction we are working on is invalid, so AtAbort_Twophase will simply delete it from shmem. 2) Not all on-commit/on-rollback callbacks are completed, so (for example) transaction locks may remain. Thus, if the COMMIT/ROLLBACK PREPARED command faces an ERROR, the user 1) will not see the transaction in pg_prepared_xacts, but 2) will see its locks in pg_locks. IIUC, the only way to release these locks is to restart the server. I don't see any discussion about it on hackers, and I also don't see any sane fix for this problem. So, I decided to share it with you. Looking forward to your comments. Please, see the attached test that demonstrates the problem. Actually, reproduction is difficult. The only way to get an ERROR during callbacks execution is to trigger OOM. Since it is hard to implement from perl tests, I have added an injection point. (The patch can be applied to the newest master). -- Best regards, Daniil Davydov
From 2cbbe8ccde03b46971b81e2f3f0affa99bded0d1 Mon Sep 17 00:00:00 2001 From: Daniil Davidov <[email protected]> Date: Sat, 23 May 2026 20:27:30 +0700 Subject: [PATCH] Reproduce error with twophase commit --- src/backend/access/transam/twophase.c | 2 + src/test/recovery/meson.build | 1 + .../recovery/t/053_twophase_commit_error.pl | 76 +++++++++++++++++++ 3 files changed, 79 insertions(+) create mode 100644 src/test/recovery/t/053_twophase_commit_error.pl diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 1035e8b3fc7..07a628e2a78 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1652,6 +1652,8 @@ FinishPreparedTransaction(const char *gid, bool isCommit) */ LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE); + INJECTION_POINT("twophase-callbacks-execution", NULL); + /* And now do the callbacks */ if (isCommit) ProcessRecords(bufptr, fxid, twophase_postcommit_callbacks); diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build index 36d789720a3..6305ce88b67 100644 --- a/src/test/recovery/meson.build +++ b/src/test/recovery/meson.build @@ -61,6 +61,7 @@ tests += { 't/050_redo_segment_missing.pl', 't/051_effective_wal_level.pl', 't/052_checkpoint_segment_missing.pl', + 't/053_twophase_commit_error.pl', ], }, } diff --git a/src/test/recovery/t/053_twophase_commit_error.pl b/src/test/recovery/t/053_twophase_commit_error.pl new file mode 100644 index 00000000000..96006c6dd2d --- /dev/null +++ b/src/test/recovery/t/053_twophase_commit_error.pl @@ -0,0 +1,76 @@ +# Copyright (c) 2026, PostgreSQL Global Development Group + +# Test failure when COMMIT PREPARED gets an ERROR after its commit WAL record +# has been emitted, but before two-phase cleanup callbacks have released locks. + +use strict; +use warnings FATAL => 'all'; + +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + +if ($ENV{enable_injection_points} ne 'yes') +{ + plan skip_all => 'Injection points not supported by this build'; +} + +my $node = PostgreSQL::Test::Cluster->new('node'); +$node->init; +$node->append_conf('postgresql.conf', 'max_prepared_transactions = 5'); +$node->start; + +if (!$node->check_extension('injection_points')) +{ + plan skip_all => 'Extension injection_points not installed'; +} + +$node->safe_psql('postgres', 'CREATE EXTENSION injection_points'); +$node->safe_psql( + 'postgres', q{ + SELECT injection_points_attach('twophase-callbacks-execution', 'error') +}); + +$node->safe_psql( + 'postgres', q{ + CREATE TABLE twophase_lock_target(id int); + CREATE TABLE twophase_commit_marker(id int); + + BEGIN; + LOCK TABLE twophase_lock_target IN ACCESS EXCLUSIVE MODE; + INSERT INTO twophase_commit_marker VALUES (1); + PREPARE TRANSACTION 'prepared'; +}); + +my $table_oid = $node->safe_psql( + 'postgres', q{ + SELECT oid FROM pg_class WHERE relname = 'twophase_lock_target'; +}); + + +my ($ret, $stdout, $stderr) = + $node->psql('postgres', q{COMMIT PREPARED 'prepared'}); + +isnt($ret, 0, 'COMMIT PREPARED failed at injectcion point'); + +$ret = $node->safe_psql( + 'postgres', q{ + SELECT count(*) FROM pg_prepared_xacts; +}); +is($ret, 0, 'we cannot see transaction in the catalog'); + +$ret = $node->safe_psql( + 'postgres', q{ + SELECT * FROM twophase_commit_marker; +}); +is($ret, 1, 'transaction has been commited'); + + +$ret = $node->safe_psql( + 'postgres', qq{ + SELECT COUNT(*) FROM pg_locks WHERE locktype = 'relation' AND relation = $table_oid; +}); +is($ret, 1, 'orphaned lock detected'); + +$node->stop; +done_testing(); -- 2.43.0
