On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2...@gmail.com> wrote:
>
> ======
> src/backend/replication/slot.c
>
> ReportSlotInvalidation:
>
> 1.
> +
> + case RS_INVAL_IDLE_TIMEOUT:
> + Assert(inactive_since > 0);
> + /* translator: second %s is a GUC variable name */
> + appendStringInfo(&err_detail,
> + _("The slot has remained idle since %s, which is longer than the
> configured \"%s\" duration."),
> + timestamptz_to_str(inactive_since),
> + "idle_replication_slot_timeout");
> + break;
> +
>
> errdetail:
>
> I guess it is no fault of this patch because I see you've only copied
> nearby code, but AFAICT this function is still having an each-way bet
> by using a mixture of _() macro which is for strings intended be
> translated, but then only using them in errdetail_internal() which is
> for strings that are NOT intended to be translated. Isn't it
> contradictory? Why don't we use errdetail() here?
>

Your question is valid and I don't have an answer. I encourage you to
start a new thread to clarify this.

> errhint:
>
> Also, the way the 'hint' is implemented can only be meaningful for
> RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was
> always strange, but now that this patch has added another kind of
> switch (cause) this hint implementation now looks increasingly hacky
> to me; it is also inflexible -- e.g. if you ever wanted to add
> different hints. A neater implementation would be to make the code
> more like how the err_detail is handled, so then the errhint string
> would only be assigned within the "case RS_INVAL_WAL_REMOVED:"
>

This makes sense to me.

+
+ case RS_INVAL_IDLE_TIMEOUT:
+ Assert(inactive_since > 0);
+ /* translator: second %s is a GUC variable name */
+ appendStringInfo(&err_detail,
+ _("The slot has remained idle since %s, which is longer than the
configured \"%s\" duration."),
+ timestamptz_to_str(inactive_since),
+ "idle_replication_slot_timeout");

I think the above message should be constructed on a model similar to
the following nearby message:"The slot's restart_lsn %X/%X exceeds the
limit by %llu bytes.". So, how about the following: "The slot's idle
time %s exceeds the configured \"%s\" duration"?

Also, similar to max_slot_wal_keep_size, we should give a hint in this
case to increase idle_replication_slot_timeout.

It is not clear why the injection point test is doing
pg_sync_replication_slots() etc. in the patch. The test should be
simple such that after creating a new physical or logical slot, enable
the injection point, then run the manual checkpoint command, and check
the invalidation status of the slot.

-- 
With Regards,
Amit Kapila.


Reply via email to