On Mon, Jul 7, 2025 at 3:31 AM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote:
>
> On Mon, Jul 7, 2025 at 10:13 AM Zhijie Hou (Fujitsu) wrote:
> >
> > On Sun, Jul 6, 2025 at 10:51 PM Masahiko Sawada wrote:
> > >
> > > On Fri, Jul 4, 2025 at 8:18 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote:
> > > >
> > > > On Wed, Jul 2, 2025 at 3:28 PM Hou, Zhijie wrote:
> > > > > Kindly use the latest patch set for performance testing.
> > > >
> > > > During testing, we observed a limitation in cascading logical
> > > > replication setups, such as (A -> B -> C). When retain_conflict_info
> > > > is enabled on Node C, it may not retain information necessary for
> > > > conflict detection when applying changes originally replicated from
> > > > Node A. This happens because Node C only waits for locally
> > > > originated changes on Node B to be applied before advancing the
> > > > non-removable transaction ID.
> > > >
> > > > For example, consider a logical replication setup as mentioned above: A -> B -> C.
> > > > - All three nodes have a table t1 with two tuples (1,1) (2,2).
> > > > - Node B subscribed to all changes of t1 from Node A.
> > > > - Node C subscribed to all changes from Node B.
> > > > - Subscriptions use the default origin=ANY, as this is not a
> > > >   bidirectional setup.
> > > >
> > > > Now, consider two concurrent operations:
> > > > - @9:00 Node A - UPDATE (1,1) -> (1,11)
> > > >
> > > > - @9:02 Node C - DELETE (1,1)
> > > >
> > > > Assume a slight delay at Node B before it applies the update from Node A.
> > > >
> > > > @9:03 Node C - advances the non-removable XID because it sees no
> > > > concurrent transactions from Node B. It is unaware of Node A's
> > > > concurrent update.
> > > >
> > > > @9:04 Node B - receives Node A's UPDATE and applies (1,1) -> (1,11)
> > > >        t1 has tuples: (1,11), (2,2)
> > > >
> > > > @9:05 Node C - receives the UPDATE (1,1) -> (1,11)
> > > >   - As the conflict slot's xmin has been advanced, the deleted tuple
> > > >     may already have been removed.
> > > >   - Conflict resolution fails to detect update_deleted and instead
> > > >     raises update_missing.
> > > >
> > > > Note that, as per the decoding logic, Node C sees the commit timestamp
> > > > of the update as 9:00 (the origin commit_ts from Node A), not 9:04
> > > > (the commit time on Node B). In this case, since the UPDATE's
> > > > timestamp is earlier than the DELETE's, Node C should ideally detect
> > > > an update_deleted conflict. However, it cannot, because it no longer
> > > > retains the deleted tuple.
> > > >
> > > > Even if Node C attempts to retrieve the latest WAL position from
> > > > Node A, Node C doesn't maintain any LSN which we could use to compare
> > > > with it.
> > > >
> > > > This scenario is similar to another restriction in the patch where
> > > > retain_conflict_info is not supported if the publisher is also a
> > > > physical standby, as the required transaction information from the
> > > > original primary is unavailable. Moreover, this limitation is
> > > > relevant only when the subscription origin option is set to ANY, as
> > > > only in that case can changes from other origins be replicated.
> > > > Since retain_conflict_info is primarily useful for conflict
> > > > detection in bidirectional clusters where the origin option is set
> > > > to NONE, this limitation appears acceptable.
> > > >
> > > > Given these findings, to help users avoid unintended configurations,
> > > > we plan to issue a warning in scenarios where replicated changes may
> > > > include origins other than the direct publisher, similar to the
> > > > existing checks in the check_publications_origin() function.
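The timeline above can be sketched as a toy simulation. This is purely illustrative (none of these names are PostgreSQL identifiers): `retained_dead_tuples` stands in for the dead tuples that the conflict slot's xmin protects from VACUUM, and clearing it models advancing the non-removable XID too early.

```python
# Toy model of the A -> B -> C scenario on Node C. Not PostgreSQL code;
# names and data structures are illustrative assumptions.

table_c = {1: (1, 1), 2: (2, 2)}   # Node C's copy of t1
retained_dead_tuples = {}          # key -> (old tuple, delete commit_ts)

# @9:02 Node C deletes (1,1); the dead tuple is retained for now.
deleted = table_c.pop(1)
retained_dead_tuples[1] = (deleted, "9:02")

# @9:03 Node C advances the non-removable XID: it sees no concurrent
# transactions from its direct publisher (Node B) and is unaware of
# Node A's in-flight UPDATE. VACUUM may now remove the dead tuple.
retained_dead_tuples.clear()

# @9:05 the UPDATE (1,1) -> (1,11) arrives at Node C carrying the *origin*
# commit_ts from Node A (9:00, earlier than the DELETE at 9:02), not the
# time Node B applied it (9:04).
update_origin_ts = "9:00"
if 1 in table_c:
    conflict = None                 # target row present: plain apply
elif 1 in retained_dead_tuples:
    conflict = "update_deleted"     # the ideal outcome
else:
    conflict = "update_missing"     # what actually happens here

print(conflict)   # -> update_missing
```

Had the dead tuple still been retained at 9:05, the `update_deleted` branch would have fired instead, which is exactly the information lost by advancing the XID at 9:03.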
> > > >
> > > > Here is the latest patch that implements the warning and documents
> > > > this case. Only 0001 is modified for this.
> > > >
> > > > A big thanks to Nisha for invaluable assistance in identifying this
> > > > case and preparing the analysis for it.
> > >
> > > I'm still reviewing the 0001 patch but let me share some comments and
> > > questions I have so far:
> >
> > Thanks for the comments!
> >
> > >
> > > ---
> > > +/*
> > > + * Determine the minimum non-removable transaction ID across all apply workers
> > > + * for subscriptions that have retain_conflict_info enabled. Store the result
> > > + * in *xmin.
> > > + *
> > > + * If the replication slot cannot be advanced during this cycle, due to either
> > > + * a disabled subscription or an inactive worker, set *can_advance_xmin to
> > > + * false.
> > > + */
> > > +static void
> > > +compute_min_nonremovable_xid(LogicalRepWorker *worker,
> > > +                             bool retain_conflict_info, TransactionId *xmin,
> > > +                             bool *can_advance_xmin)
> > >
> > > I think this function is quite confusing for several reasons. For
> > > instance, it's doing more things than described in the comments, such
> > > as trying to create the CONFLICT_DETECTION_SLOT if no worker is
> > > passed. Also, one of the callers describes:
> > >
> > > +            /*
> > > +             * This is required to ensure that we don't advance the xmin
> > > +             * of CONFLICT_DETECTION_SLOT even if one of the subscriptions
> > > +             * is not enabled. Otherwise, we won't be able to detect
> > > +             * conflicts reliably for such a subscription even though it
> > > +             * has set the retain_conflict_info option.
> > > +             */
> > > +            compute_min_nonremovable_xid(NULL, sub->retainconflictinfo,
> > > +                                         &xmin, &can_advance_xmin);
> > >
> > > but it's unclear to me from the function name that it tries to create
> > > the replication slot. Furthermore, in this path it doesn't actually
> > > compute xmin.
> > > I guess we can try to create CONFLICT_DETECTION_SLOT in the loop of
> > > "foreach(lc, sublist)" and set can_advance_xmin to false if either
> > > the subscription is disabled or the worker is not running.
> >
> > I understand. The original code was similar to your suggestion, but we
> > decided to encapsulate it within a separate function to maintain a clean
> > and concise main loop. However, your suggestion also makes sense, so I
> > will proceed with the change.
>
> I have made this change in the 0002 patch for reference. What do you think?
> If there are no objections, I plan to merge it in the next version.
>
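For reference, the shape of the suggested restructuring — doing the slot creation and the "cannot advance" checks inline in the subscription loop rather than inside compute_min_nonremovable_xid() — can be modeled roughly as below. This is a simplified sketch, not the actual launcher code; all identifiers here are illustrative, not PostgreSQL APIs.

```python
# Toy model of the per-cycle subscription loop suggested above.
# "slot_exists = True" stands in for creating CONFLICT_DETECTION_SLOT.

def compute_xmin_to_advance(subscriptions, workers, slot_exists):
    """Return (xmin, can_advance_xmin, slot_exists) for one cycle."""
    xmin = None
    can_advance_xmin = True
    for sub in subscriptions:
        if not sub["retain_conflict_info"]:
            continue
        # Create the conflict-detection slot as soon as any subscription
        # requests retained conflict info, whether or not it is enabled.
        if not slot_exists:
            slot_exists = True
        worker = workers.get(sub["name"])
        # A disabled subscription or a missing worker means the slot's
        # xmin must not be advanced this cycle; otherwise conflicts for
        # that subscription could no longer be detected reliably.
        if not sub["enabled"] or worker is None:
            can_advance_xmin = False
            continue
        w_xmin = worker["nonremovable_xid"]
        xmin = w_xmin if xmin is None else min(xmin, w_xmin)
    return xmin, can_advance_xmin, slot_exists

subs = [
    {"name": "s1", "retain_conflict_info": True, "enabled": True},
    {"name": "s2", "retain_conflict_info": True, "enabled": False},
]
workers = {"s1": {"nonremovable_xid": 1000}}
print(compute_xmin_to_advance(subs, workers, slot_exists=False))
# -> (1000, False, True): the slot gets created, but xmin is not
#    advanced because s2 is disabled.
```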
The changes in the 0002 patch look good to me. I've attached a patch with
some minor suggestions. Please incorporate these changes if you agree.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
change_masahiko_v49.patch
Description: Binary data