On Fri, Apr 25, 2025 at 4:06 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> >
> > Here is V30 patch set includes the following changes:
> >
>
> Thank You for the patch, please find few concerns:
>
> 1)
> By looking at code of ApplyLauncherMain(), it appears that we stopped
> advancing shared-slot's xmin if any of the subscriptions with
> retain_conflict_info is disabled. If a subscription is not being used
> and is disabled forever (or for quite long), that means xmin will
> never be advanced and we will keep accumulating dead-rows even if
> other subscribers with retain_conflict_info are actively setting their
> oldest_xmin. This could be problematic. Here too, there should be some
> way to set stop-conflict-rettention for such a subscription like we do
> for 'subscriber not able to catch-up case'.
> But I understand it can be complex to implement as we do not know for
> how long a subscription is disabled. If we do not find a simpler way
> to implement it, then at least we can document  such cases and the
> risks associated with disabled subscription which has
> 'retain_conflict_info' enabled. Thoughts?
>

Yeah, I agree that this is the case to be worried about but OTOH, in
such cases, even the corresponding logical_slot on the primary won't
be advanced and lead to bloat on the primary as well. So, we can
probably give WARNING when user tries to disable a subscription or
create a disabled subscription with retention flag true. We may need
to  write a LOG or raise a WARNING when while exiting apply worker
disables the subscription due disable_on_error option. In addition to
that, we can even document this case. Will that address your concern?

Few minor comments on 0001:
1.
+ if (TimestampDifferenceExceeds(data->reply_time,
+    data->candidate_xid_time, 0))
+ ereport(ERROR,
+ errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("oldest_nonremovable_xid transaction ID may be advanced prematurely"),

Shouldn't this be elog as this is an internal message? And, instead of
"... may be ..", shall we use ".. could be .." in the error message as
the oldest_nonremovable_xid is not yet advanced by this time.

2.
+ * It's necessary to use FullTransactionId here to mitigate potential race
+ * conditions. Such scenarios might occur if the replication slot is not
+ * yet created by the launcher while the apply worker has already
+ * initialized this field.

IIRC, we discussed why it isn't easy to close this race condition. Can
we capture that in comments separately?

-- 
With Regards,
Amit Kapila.


Reply via email to