On Wed, Jan 15, 2025 at 5:57 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Mon, Jan 13, 2025 at 8:39 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > As of now, I can't think of a way to throttle the publisher when the
> > apply_worker lags. Basically, we need some way to throttle (reduce the
> > speed of backends) when the apply worker is lagging behind a threshold
> > margin. Can you think of some way? I thought if one notices frequent
> > invalidation of the launcher's slot due to max_lag, then they can
> > rebalance their workload on the publisher.
>
> I don't have any ideas other than invalidating the launcher's slot
> when the apply lag is huge. We can think of invalidating the
> launcher's slot for some reasons such as the replay lag, the age of
> slot's xmin, and the duration.
>
Right, this is exactly where we are heading. I think we can add reasons
step-wise. For example, as a first step, we can invalidate the slot due
to replay lag; then, over time, we can add other reasons as well. One
thing that needs more discussion is the exact way to invalidate the
slot. I mentioned a couple of ideas in my previous email, which I am
repeating here: "If we just invalidate the slot, users can check the
status of the slot and need to disable/enable retain_conflict_info
again to start retaining the required information. This would be
required because we can't allow system slots (slots created internally)
to be created by users. The other way could be that instead of
invalidating the slot, we directly drop/re-create the slot or increase
its xmin. If we choose to advance the slot automatically without user
intervention, we need to let users know via a LOG message and/or via
information in the view."

>
> >
> > > The max_lag idea sounds interesting for the case
> > > where the subscriber is much behind. Probably we can visit this idea
> > > as a new feature after completing this feature.
> > >
> >
> > Sure, but what will be our answer to users for cases where the
> > performance tanks due to bloat accumulation? The tests show that once
> > the apply_lag becomes large, it becomes almost impossible for the
> > apply worker to catch up (or take a very long time) and advance the
> > slot's xmin. The users can disable retain_conflict_info to bring back
> > the performance and get rid of bloat but I thought it would be easier
> > for users to do that if we have some knob where they don't need to
> > wait till actually the problem of bloat/performance dip happens.
>
> Probably retaining dead tuples based on the time duration or its age
> might be other solutions, it would increase a risk of not being able
> to detect update_deleted conflict though. I think in any way as long
> as we accumulate dead tulpes to detect update_deleted conflicts, it
> would be a tradeoff between reliably detecting update_deleted
> conflicts and the performance.
>

Right, and users have an option for it. Say, if they set max_lag to -1
(or some special value), we won't invalidate the slot, so the
update_deleted conflict can be detected with complete reliability. At
this stage, it is okay if this information is LOGGED and displayed via
a system view. We will need more thought while working on the CONFLICT
RESOLUTION patch; for example, we may need to additionally raise a
WARNING or ERROR if the remote tuple's commit_time is earlier than the
time the slot was last invalidated. I don't want to go into a detailed
discussion at this point but just wanted you to know that we will need
additional work for the resolution of update_deleted conflicts to avoid
inconsistency.
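Just to make the "LOGGED and displayed via a system view" part a bit
more concrete, below is a rough sketch of the kind of monitoring a user
could do once such a slot exists. This is only an illustration, not
what the patch necessarily implements: the slot name
'pg_conflict_detection' is a placeholder for whatever name we finally
pick for the launcher's slot, and surfacing the max_lag invalidation
via the existing invalidation_reason column of pg_replication_slots is
an assumption on my part.

    -- On the publisher: how far behind is the subscriber's apply
    -- feedback? (replay_lag can be NULL if no feedback has arrived yet)
    SELECT application_name, replay_lag
    FROM pg_stat_replication;

    -- On the subscriber: is the launcher's slot still valid, and which
    -- xmin is it holding back?
    SELECT slot_name, xmin, invalidation_reason
    FROM pg_replication_slots
    WHERE slot_name = 'pg_conflict_detection';  -- placeholder name

If the second query shows the slot as invalidated, the user would
disable and re-enable retain_conflict_info (as described above) to
start retaining the required information again.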
> As for detecting update_deleted conflicts, we probably don't need the
> whole tuple data of deleted tuples. It would be sufficient if we can
> check XIDs of deleted tuple to get their origins and commit
> timestamps. Probably the same is true for the old version of updated
> tuple in terms of detecting update_origin_differ conflicts. If my
> understanding is right, probably we can remove only the tuple data of
> dead tuples that are older than a xmin horizon (excluding the
> launcher's xmin), while leaving the heap tuple header, which can
> minimize the table bloat.
>

I am afraid that is not possible because, even to detect the conflict,
we first need to find the matching tuple on the subscriber node. If the
replica_identity or primary_key is present in the table, we could try
to save just the key columns along with the transaction information,
but that won't be simple either. Also, if the RI or primary_key is not
there, we need the entire tuple to find a match. We would need a
concept of tombstone tables (or we can call it a dead-rows-store) where
the old data is stored reliably until we no longer need it. We briefly
discussed that idea previously [1][2] and decided to move forward with
retaining the dead tuples instead, on the theory that we already use
similar ideas in other places.

BTW, a related point to note is that we need to retain the
conflict_info even to detect the origin_differ conflict with complete
reliability; we need only the commit_ts information for that case. See
the analysis in [3].

[1] - https://www.postgresql.org/message-id/CAJpy0uCov4JfZJeOvY0O21_gk9bcgNUDp4jf8%2BBbMp%2BEAv8cVQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/e4cdb849-d647-4acf-aabe-7049ae170fbf%40enterprisedb.com
[3] - https://www.postgresql.org/message-id/OSCPR01MB14966F6B816880165E387758AF5112%40OSCPR01MB14966.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.