On Tue, Feb 2, 2021 at 10:34 AM Ajin Cherian <itsa...@gmail.com> wrote:
>
> On Mon, Feb 1, 2021 at 11:26 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > I have updated the patch to display WARNING for each of the tablesync
> > slots during DropSubscription. As discussed, I have moved the drop
> > slot related code towards the end in AlterSubscription_refresh. Apart
> > from this, I have fixed one more issue in tablesync code wherein
> > after catching the exception we were not clearing the transaction
> > state on the publisher, see changes in LogicalRepSyncTableStart. I
> > have also fixed other comments raised by you. Additionally, I have
> > removed the test because it was creating the same name slot as the
> > tablesync worker and tablesync worker removed the same due to new
> > logic in LogicalRepSyncTableStart. Earlier, it was not failing because of
> > the bug in that code which I have fixed in the attached.
> >
>
> I was testing this patch. I had a table on the subscriber which had a
> row that would cause a PK constraint
> violation during the table copy. This is resulting in the subscriber
> trying to rollback the table copy and failing.
>

I am not getting this error. I have tried the test below:

Publisher
===========
CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE PUBLICATION mypublication FOR TABLE mytbl1;

Subscriber
=============
CREATE TABLE mytbl1(id SERIAL PRIMARY KEY, somedata int, text varchar(120));

BEGIN;
INSERT INTO mytbl1(somedata, text) VALUES (1, 1);
INSERT INTO mytbl1(somedata, text) VALUES (1, 2);
COMMIT;

CREATE SUBSCRIPTION mysub
    CONNECTION 'host=localhost port=5432 dbname=postgres'
    PUBLICATION mypublication;

It generates the PK violation the first time; I then removed the
conflicting rows on the subscriber and it passed. See logs below.

2021-02-02 13:51:34.316 IST [20796] LOG: logical replication table synchronization worker for subscription "mysub", table "mytbl1" has started
2021-02-02 13:52:43.625 IST [20796] ERROR: duplicate key value violates unique constraint "mytbl1_pkey"
2021-02-02 13:52:43.625 IST [20796] DETAIL: Key (id)=(1) already exists.
2021-02-02 13:52:43.625 IST [20796] CONTEXT: COPY mytbl1, line 1
2021-02-02 13:52:43.695 IST [27840] LOG: background worker "logical replication worker" (PID 20796) exited with exit code 1
2021-02-02 13:52:43.884 IST [6260] LOG: logical replication table synchronization worker for subscription "mysub", table "mytbl1" has started
2021-02-02 13:53:54.680 IST [6260] LOG: logical replication table synchronization worker for subscription "mysub", table "mytbl1" has finished

Also, a similar test exists in 004_sync.pl; is that also failing for
you? Can you please provide detailed steps that led to this failure?

>
> And one more thing I see is that now we error out in PG_CATCH() in
> LogicalRepSyncTableStart() with the above error and as a result, the
> tablesync slot is not dropped. Hence causing the slot create to fail
> in the next restart.
> I think this can be avoided.
> We could either attempt a rollback only
> on specific failures and drop slot prior to erroring out.
>

Hmm, we have to first rollback before attempting any other operation
because the transaction on the publisher is in an errored state.

-- 
With Regards,
Amit Kapila.
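
The ordering constraint in the reply above (roll back first, only then drop the slot) can be illustrated with a toy model. This is a minimal sketch and not the patch's code: `PublisherConnection`, `failed_copy_connection`, the two cleanup helpers, and the slot name `toy_sync_slot` are all hypothetical names invented for illustration. The only behaviour it assumes is the one stated in the mail, namely that a connection whose transaction is in an errored state rejects every command until the transaction is rolled back.

```python
class PublisherConnection:
    """Toy stand-in for the tablesync worker's connection to the publisher."""

    def __init__(self):
        self.in_failed_transaction = False
        self.slots = set()

    def execute(self, command):
        # Assumption from the thread: an errored transaction accepts
        # nothing except the command that ends it.
        if self.in_failed_transaction and command != "ROLLBACK":
            raise RuntimeError("transaction is in an errored state")
        if command == "ROLLBACK":
            self.in_failed_transaction = False
        elif command.startswith("CREATE_REPLICATION_SLOT "):
            self.slots.add(command.split()[1])
        elif command.startswith("DROP_REPLICATION_SLOT "):
            self.slots.discard(command.split()[1])


def failed_copy_connection(slot):
    """Build a connection that owns a slot and whose transaction has failed."""
    conn = PublisherConnection()
    conn.execute(f"CREATE_REPLICATION_SLOT {slot}")
    conn.in_failed_transaction = True  # simulate the failed table copy
    return conn


def cleanup_drop_first(conn, slot):
    """Try to drop the slot before rolling back; report whether it worked."""
    try:
        conn.execute(f"DROP_REPLICATION_SLOT {slot}")
        conn.execute("ROLLBACK")
        return True
    except RuntimeError:
        return False


def cleanup_rollback_first(conn, slot):
    """Roll back first, then drop the slot on the now-usable connection."""
    conn.execute("ROLLBACK")
    conn.execute(f"DROP_REPLICATION_SLOT {slot}")
    return True


if __name__ == "__main__":
    a = failed_copy_connection("toy_sync_slot")
    print(cleanup_drop_first(a, "toy_sync_slot"), a.slots)

    b = failed_copy_connection("toy_sync_slot")
    print(cleanup_rollback_first(b, "toy_sync_slot"), b.slots)
```

In this model the drop-first variant leaves the slot behind, which corresponds to the leftover tablesync slot that makes the next slot creation fail, while the rollback-first variant removes it.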