> On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> I've attached rebased patches.
Here are some review comments:
For the v12-0002 patch:
The documentation changes for ALTER SUBSCRIPTION .. RESET look strange to me.
You grouped SET and RESET together, much like sql-altertable.html has them
grouped, but I don't think it flows naturally here, as the two commands do not
support the same set of parameters. It might look better if you documented
these separately. It might also be good to order the parameters the same, so
that the differences can more quickly be seen.
For the v12-0003 patch:
I believe this feature is needed, but it also seems like a very powerful
foot-gun. Can we do anything to make it less likely that users will hurt
themselves with this tool?
I am thinking back to support calls I have attended. When a production system
is down, there is often some hesitancy to perform ad-hoc operations on the
database, but once the decision has been made to do so, people try to get the
whole process done as quickly as possible. If multiple transactions on the
publisher fail on the subscriber, they will do so in series, not in parallel.
The process of clearing these errors will amount to copying the xid of each
failed transaction to the ALTER SUBSCRIPTION ... SET (skip_xid = xxx) command
and running it, then the next, then the next, .... Perhaps the first couple
times through the process, the customer will look to see that the failure is of
the same type and on the same table, but after a short time they will likely
just script something to clear the rest as quickly as possible. In the heat of
the moment, they may not include a check of the failure message, but merely a
grep of the failing xid.
If the user could instead clear all failed transactions of the same type, that
might make it less likely that they unthinkingly also skip subsequent errors of
some different type. Perhaps something like ALTER SUBSCRIPTION ... SET
(skip_failures = 'duplicate key value violates unique constraint "test_pkey"')?
This is arguably a different feature request, and not something your patch is
required to address, but I wonder how much we should limit people shooting
themselves in the foot? If we built something like this using your skip_xid
feature, rather than instead of your skip_xid feature, would your feature need
to be modified?
The docs could use some rewording, too:
+ If incoming data violates any constraints the logical replication
+ will stop until it is resolved.
In my experience, logical replication doesn't stop, but instead goes into an
infinite loop of retries.
+ The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming change or by skipping the whole transaction.
I'm having trouble thinking of an example conflict where skipping a transaction
would be better than writing a BEFORE INSERT trigger on the conflicting table
which suppresses or redirects conflicting rows somewhere else. Particularly
for larger transactions containing multiple statements, suppressing the
conflicting rows using a trigger would be less messy than skipping the
transaction. I think your patch adds a useful tool to the toolkit, but maybe
we should mention more alternatives in the docs? Something like, "changing the
data on the subscriber so that it doesn't conflict with incoming changes, or
dropping the conflicting constraint or unique index, or writing a trigger on
the subscriber to suppress or redirect conflicting incoming changes, or as a
last resort, by skipping the whole transaction"?
Perhaps I'm reading your phrase "changing the data on the subscriber" too
narrowly. To me, that means running DML (either a DELETE or an UPDATE) on the
existing data in the table where the conflict arises. These other options are
DDL and do not easily come to mind when I read that phrase.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company