On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > I've attached a rebased patch. Thank you for your rebase !
Several review comments on v8. (1) doc/src/sgml/logical-replication.sgml + + <para> + To resolve conflicts, you need to consider changing the data on the subscriber so + that it doesn't conflict with incoming changes, or dropping the conflicting constraint + or unique index, or writing a trigger on the subscriber to suppress or redirect + conflicting incoming changes, or as a last resort, by skipping the whole transaction. + Skipping the whole transaction includes skipping changes that may not violate + any constraint. This can easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin. + </para> The first sentence is too long and lack of readability slightly. One idea to sort out listing items is to utilize "itemizedlist". For instance, I imagined something like below. <para> To resolve conflicts, you need to consider following actions: <itemizedlist> <listitem> <para> Change the data on the subscriber so that it doesn't conflict with incoming changes </para> </listitem> ... <listitem> <para> As a last resort, skip the whole transaction </para> </listitem> </itemizedlist> .... </para> What did you think ? By the way, in case only when you want to keep the current sentence style, I have one more question. Do we need "by" in the part "by skipping the whole transaction" ? If we focus on only this action, I think the sentence becomes "you need to consider skipping the whole transaction". If this is true, we don't need "by" in the part. (2) Also, in the same paragraph, we write + ... This can easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin. The subject of this sentence should be "Those" or "Some of those" ? because we want to mention either "new skip xid feature" or "pg_replication_origin_advance". (3) doc/src/sgml/ref/alter_subscription.sgml Below change contains unnecessary spaces. + the whole transaction. Using <command> ALTER SUBSCRIPTION ... SKIP </command> Need to change From: <command> ALTER SUBSCRIPTION ... SKIP </command> To: <command>ALTER SUBSCRIPTION ... SKIP</command> (4) comment in clear_subscription_skip_xid + * the flush position the transaction will be sent again and the user + * needs to be set subskipxid again. We can reduce the possibility by Shoud change From: the user needs to be set... To: the user needs to set... (5) clear_subscription_skip_xid + if (!HeapTupleIsValid(tup)) + elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name); Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT ? This suggestion has another aspect that in within one patch, we don't mix both ereport and elog at the same time. (6) apply_handle_stream_abort @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s) logicalrep_read_stream_abort(s, &xid, &subxid); + /* + * We don't expect the user to set the XID of the transaction that is + * rolled back but if the skip XID is set, clear it. + */ + if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid) + clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0); + In my humble opinion, this still cares about subtransaction xid still. If we want to be consistent with top level transactions only, I felt checking MySubscription->skipxid == xid should be sufficient. Below is an *insame* (in a sense not correct usage) scenario to hit the "MySubscription->skipxid == subxid". Sorry if it is not perfect. ------- Set logical_decoding_work_mem = 64. Create tables named 'tab' with a column id (integer); Create pub and sub with streaming = true. No initial data is required on both nodes because we just want to issue stream_abort after executing skip xid feature. <Session1> to the publisher begin; select pg_current_xact_id(); -- for reference insert into tab values (1); savepoint s1; insert into tab values (2); savepoint s2; insert into tab values (generate_series(1001, 2000)); select ctid, xmin, xmax, id from tab where id in (1, 2, 1001); <Session2> to the subscriber select subname, subskipxid from pg_subscription; -- shows 0 alter subscription mysub skip (xid = xxx); -- xxx is that of xmin for 1001 on the publisher select subname, subskipxid from pg_subscription; -- check it shows xxx just in case <Session1> rollback to s1; commit; select * from tab; -- shows only data '1'. <Session2> select subname, subskipxid from pg_subscription; -- shows 0. subskipxid was reset by the skip xid feature select count(1) = 1 from tab; -- shows true FYI: the commands result of those last two commands. postgres=# select subname, subskipxid from pg_subscription; subname | subskipxid ---------+------------ mysub | 0 (1 row) postgres=# select count(1) = 1 from tab; ?column? ---------- t (1 row) Thus, it still cares about subtransactions and clear the subskipxid. Should we fix this behavior for consistency ? Best Regards, Takamichi Osumi