On Mon, Jul 7, 2025 at 3:31 AM Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> On Mon, Jul 7, 2025 at 10:13 AM Zhijie Hou (Fujitsu) wrote:
> >
> > On Sun, Jul 6, 2025 at 10:51 PM Masahiko Sawada wrote:
> > >
> > > On Fri, Jul 4, 2025 at 8:18 PM Zhijie Hou (Fujitsu)
> > > <houzj.f...@fujitsu.com>
> > > wrote:
> > > >
> > > > On Wed, Jul 2, 2025 at 3:28 PM Hou, Zhijie wrote:
> > > > > Kindly use the latest patch set for performance testing.
> > > >
> > > > During testing, we observed a limitation in cascading logical
> > > > replication setups, such as (A -> B -> C). When retain_conflict_info
> > > > is enabled on Node C, it may not retain information necessary for
> > > > conflict detection when applying changes originally replicated from
> > > > Node A. This happens because Node C only waits for locally
> > > > originated changes on Node B to be applied before advancing the
> > > > non-removable transaction ID.
> > > >
> > > > For example, consider a logical replication setup as mentioned
> > > > above: A -> B -> C.
> > > >  - All three nodes have a table t1 with two tuples (1,1) (2,2).
> > > >  - Node B subscribed to all changes of t1 from Node A.
> > > >  - Node C subscribed to all changes from Node B.
> > > >  - Subscriptions use the default origin=ANY, as this is not a
> > > >    bidirectional setup.
> > > >
> > > > Now, consider two concurrent operations:
> > > >   - @9:00 Node A - UPDATE (1,1) -> (1,11)
> > > >
> > > >   - @9:02 Node C - DELETE (1,1)
> > > >
> > > > Assume a slight delay at Node B before it applies the update from
> > > > Node A.
> > > >
> > > >   @9:03 Node C - advances the non-removable XID because it sees no
> > > >   concurrent transactions from Node B. It is unaware of Node A’s
> > > >   concurrent update.
> > > >
> > > >   @9:04 Node B - receives Node A's UPDATE and applies (1,1) -> (1,11)
> > > >   t1 has tuples : (1,11), (2,2)
> > > >
> > > >   @9:05 Node C - receives the UPDATE (1,1) -> (1,11)
> > > >     - As the conflict slot’s xmin has advanced, the deleted tuple
> > > >       may already have been removed.
> > > >     - Conflict resolution fails to detect update_deleted and
> > > >       instead raises update_missing.
> > > >
> > > > Note that, as per the decoding logic, Node C sees the commit
> > > > timestamp of the update as 9:00 (origin commit_ts from Node A), not
> > > > 9:04 (commit time on Node B). In this case, since the UPDATE's
> > > > timestamp is earlier than the DELETE's, Node C should ideally
> > > > detect an update_deleted conflict. However, it cannot, because it
> > > > no longer retains the deleted tuple.
> > > >
> > > > Even if Node C attempts to retrieve the latest WAL position from
> > > > Node A, Node C doesn't maintain any LSN which we could use to
> > > > compare with it.
> > > >
> > > > This scenario is similar to another restriction in the patch where
> > > > retain_conflict_info is not supported if the publisher is also a
> > > > physical standby, as the required transaction information from the
> > > > original primary is unavailable. Moreover, this limitation is
> > > > relevant only when the subscription origin option is set to ANY, as
> > > > only in that case can changes from other origins be replicated.
> > > > Since retain_conflict_info is primarily useful for conflict
> > > > detection in bidirectional clusters where the origin option is set
> > > > to NONE, this limitation appears acceptable.
> > > >
> > > > Given these findings, to help users avoid unintended configurations,
> > > > we plan to issue a warning in scenarios where replicated changes may
> > > > include origins other than the direct publisher, similar to the
> > > > existing checks in the
> > > > check_publications_origin() function.
> > > >
> > > > Here is the latest patch that implements the warning and documents
> > > > this case. Only 0001 is modified for this.
> > > >
> > > > A big thanks to Nisha for the invaluable assistance in identifying
> > > > this case and preparing the analysis for it.
> > >
> > > I'm still reviewing the 0001 patch but let me share some comments and
> > > questions I have so far:
> >
> > Thanks for the comments!
> >
> > >
> > > ---
> > > +/*
> > > + * Determine the minimum non-removable transaction ID across all apply
> > > + * workers for subscriptions that have retain_conflict_info enabled.
> > > + * Store the result in *xmin.
> > > + *
> > > + * If the replication slot cannot be advanced during this cycle, due to
> > > + * either a disabled subscription or an inactive worker, set
> > > + * *can_advance_xmin to false.
> > > + */
> > > +static void
> > > +compute_min_nonremovable_xid(LogicalRepWorker *worker,
> > > +                            bool retain_conflict_info, TransactionId *xmin,
> > > +                            bool *can_advance_xmin)
> > >
> > > I think this function is quite confusing for several reasons. For
> > > instance, it's doing more things than described in the comments, such
> > > as trying to create the CONFLICT_DETECTION_SLOT if no worker is
> > > passed. Also, one of the callers describes:
> > >
> > > +               /*
> > > +                * This is required to ensure that we don't advance the xmin
> > > +                * of CONFLICT_DETECTION_SLOT even if one of the subscriptions
> > > +                * is not enabled. Otherwise, we won't be able to detect
> > > +                * conflicts reliably for such a subscription even though it
> > > +                * has set the retain_conflict_info option.
> > > +                */
> > > +               compute_min_nonremovable_xid(NULL, sub->retainconflictinfo,
> > > +                                            &xmin, &can_advance_xmin);
> > >
> > > but it's unclear to me from the function name that it tries to create
> > > the replication slot. Furthermore, in this path it doesn't actually
> > > compute xmin. I guess we can try to create CONFLICT_DETECTION_SLOT in
> > > the loop of "foreach(lc, sublist)" and set can_advance_xmin to false
> > > if either the subscription is disabled or the worker is not running.
> >
> > I understand. The original code was similar to your suggestion, but we
> > decided to encapsulate it within a separate function to maintain a
> > clean and concise main loop. However, your suggestion also makes sense,
> > so I will proceed with the change.
>
> I have made this change in the 0002 patch for reference. What do you
> think? If there are no objections, I plan to merge it in the next
> version.
>

The changes in the 0002 patch look good to me. I've attached the patch
for some minor suggestions. Please incorporate these changes if you
agree.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment: change_masahiko_v49.patch
