On Tue, Mar 16, 2021 at 1:36 AM vignesh C <vignes...@gmail.com> wrote:
> On Tue, Feb 23, 2021 at 3:59 AM Andres Freund <and...@anarazel.de> wrote: > > > > Hi, > > > > The 2pc decoding added in > > > > commit a271a1b50e9bec07e2ef3a05e38e7285113e4ce6 > > Author: Amit Kapila <akap...@postgresql.org> > > Date: 2021-01-04 08:34:50 +0530 > > > > Allow decoding at prepare time in ReorderBuffer. > > > > has a deadlock danger when used in a way that takes advantage of > > separate decoding of the 2PC PREPARE. > > > > > > I assume the goal of decoding the 2PC PREPARE is so one can wait for the > > PREPARE to have logically replicated, before doing the COMMIT PREPARED. > > > > > > However, currently it's pretty easy to get into a state where logical > > decoding cannot progress until the 2PC transaction has > > committed/aborted. Which essentially would lead to undetected deadlocks. > > > > The problem is that catalog tables accessed during logical decoding need > > to get locked (otherwise e.g. a table rewrite could happen > > concurrently). But if the prepared transaction itself holds a lock on a > > catalog table, logical decoding will block on that lock - which won't be > > released until replication progresses. A deadlock. > > > > A trivial example: > > > > SELECT pg_create_logical_replication_slot('test', 'test_decoding'); > > CREATE TABLE foo(id serial primary key); > > BEGIN; > > LOCK pg_class; > > INSERT INTO foo DEFAULT VALUES; > > PREPARE TRANSACTION 'foo'; > > > > -- hangs waiting for pg_class to be unlocked > > SELECT pg_logical_slot_get_changes('test', NULL, NULL, > 'two-phase-commit', '1'); > > > > > > Now, more realistic versions of this scenario would probably lock a > > 'user catalog table' containing replication metadata instead of > > pg_class, but ... > > > > > > At first this seems to be a significant issue. But on the other hand, if > > you were to shut the cluster down in this situation (or disconnect all > > sessions), you have broken cluster on your hand - without logical > > decoding being involved. As it turns out, we need to read pg_class to > > log in... And I can't remember this being reported to be a problem? > > > > > > Perhaps all that we need to do is to disallow 2PC prepare if [user] > > catalog tables have been locked exclusively? Similar to how we're > > disallowing preparing tables with temp table access. > > > > Even I felt we should not allow prepare a transaction that has locked > system tables, as it does not allow creating a new session after > restart and also causes the deadlock while logical decoding of > prepared transaction. > I have made a patch to make the prepare transaction fail in this > scenario. Attached the patch for the same. > Thoughts? > > The patch applies fine on HEAD and "make check" passes fine. No major comments on the patch, just a minor comment: If you could change the error from, " cannot PREPARE a transaction that has a lock on user catalog/system table(s)" to "cannot PREPARE a transaction that has an *exclusive lock* on user catalog/system table(s)" that would be a more accurate instruction to the user. regards, Ajin Cherian Fujitsu Australia