While reviewing/testing subscriber-side work for $SUBJECT [1], I noticed a problem that seems to need a broader discussion, so started this thread. We can get prepare for the same GID more than once for the cases where we have defined multiple subscriptions for publications on the same server and prepared transaction has operations on tables subscribed to those subscriptions. For such cases, one of the prepare will be successful and others will fail in which case the server will send them again. Once the commit prepared is done for the first one, the next prepare will be successful. Now, this is not ideal but will work.
However, if the user has setup synchronous_standby_names for all the subscriptions then we won't be able to proceed because the prepare on publisher will wait for all the subscriptions to ack and the subscriptions are waiting for the first prepare to finish. See an example below for such a situation. I think this can also happen if we get any key violation while applying the changes on the subscriber, but for that, we can ask the user to remove the violating key on the subscriber as that is what we suggest now also for commits. Similarly, say the user has already prepared the transaction with the same GID on subscriber-node, then also we can get into a similar situation but for that, we can ask the user to commit such a GID. We can think of appending some unique identifier (like subid) with GID but that won't work for cascaded standby setup (where the prepares on subscriber will be again sent to another subscriber) as the GID can become too long. So that might not be a good solution, maybe we can optimize it in some way that we append only when there is a GID clash. The other thing we could do is to ask the user to temporarily disable the subscription and change synchronous_standby_settings on the publisher node. Any better ideas? Example of the above scenario, you can see this problem after applying the patches at [1]. Publisher ================= CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); BEGIN; INSERT INTO mytbl(somedata, text) VALUES (1, 1); INSERT INTO mytbl(somedata, text) VALUES (1, 2); COMMIT; CREATE PUBLICATION mypub FOR TABLE mytbl; CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); BEGIN; INSERT INTO mytbl1(somedata, text) VALUES (1, 1); INSERT INTO mytbl1(somedata, text) VALUES (1, 2); COMMIT; CREATE PUBLICATION mypub1 FOR TABLE mytbl1; Subscriber ============= CREATE TABLE mytbl(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); CREATE SUBSCRIPTION mysub CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION mypub WITH(two_phase = on); CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120)); CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost port=5432 dbname=postgres' PUBLICATION mypub1 WITH(two_phase = on); Now, set synchronous_standby_names = 'FIRST 2 (mysub, mysub1)' on the publisher in postgresql.conf and restart both publisher and subscriber, actually restart is not required as synchronous_standby_names is a SIGHUP parameter. Publisher ============= BEGIN; Insert into mytbl values(17,1,18); Insert into mytbl1 values(17,1,18); Prepare Transaction 'foo'; Now, this Prepare transaction will wait forever because on subscriber we are getting "ERROR: transaction identifier "foo" is already in use" which means it is waiting for a publisher to send commit prepared for first apply worker and publisher is waiting for both the subscriptions to send ack. This is happening because the prepared transaction on publisher operates on tables of both subscriptions. In short, on the subscriber, both the apply workers (corresponding to two subscriptions) are getting the same prepare transaction GID, leading to an error on the subscriber and making the publisher wait forever. Thoughts? [1] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com -- With Regards, Amit Kapila.