On Fri, May 16, 2025 at 11:15 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Apr 25, 2025 at 4:06 PM shveta malik <shveta.ma...@gmail.com> wrote: > > > > 2) > > in wait_for_local_flush(), we have > > should_stop_conflict_info_retention() before 'AllTablesyncsReady' > > check. Should we give a discount for table-sync time and avoid doing > > stop-conflict-retention when table-sync is going on? This is because > > table-sync is one time operation (or done only on > > subscription-refresh), so we shall not count time spent in table-sync > > for 'max_conflict_retention_duration'. We can reset our timer if > > table-sync is observed to be going on. Thoughts? > > > > Sounds reasonable to me. > > > > > 3) > > In get_candidate_xid(), we first set candidate_xid_time and later > > candidate_xid. And between these 2 there are chances that we return > > without updating candidate_xid. See 'Return if the > > oldest_nonremovable_xid cannot be advanced ' comment. That will leave > > 'candidate_xid_time' set to new value while 'candidate_xid' is not > > yet set. > > > > Good point. I think we should set 'candidate_xid_time' along with > candidate_xid (just after setting candidate_xid). > > > 4) > > Do you think there should be some relation between > > 'xid_advance_interval' and 'max_conflict_retention_duration'? Should > > max of 'xid_advance_interval' be limited by > > 'max_conflict_retention_duration'. Currently say > > xid_advance_interval' is set to max 3 mins, what if > > 'max_conflict_retention_duration' is set to 2 mins? In that case we > > will not even check for new xids before 3 mins are over, while > > 'max_conflict_retention_duration' sets a limit of 2 mins for dead > > tuples retention. > > > > Right, ideally, the 'xid_advance_interval' should be set to a value > less than 'max_conflict_retention_duration' when no new_xid is found. > > BTW, another related point is that when we decide to stop retaining > dead tuples (via should_stop_conflict_info_retention), should we also > consider the case that the apply worker didn't even try to get the > publisher status because previously it decided that > oldest_nonremovable_xid cannot be advanced due to its > OldestActiveTransactionId? >
Do you mean avoid stop-conflict-retention in such a case as apply worker itself did not request status from the publisher? If I understood your point correctly, then we can do that by advancing the timer to a new value even if we did not update candidate-xid and did not ask the status from the publisher. I think it is already happening in get_candidate_xid(). It updates the timer but not the xid (my concern #3 can be ignored then). thanks Shveta