On Fri, Jan 31, 2025 at 2:32 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2...@gmail.com> wrote: > > > > ====== > > src/backend/replication/slot.c > > > > ReportSlotInvalidation: > > > > 1. > > + > > + case RS_INVAL_IDLE_TIMEOUT: > > + Assert(inactive_since > 0); > > + /* translator: second %s is a GUC variable name */ > > + appendStringInfo(&err_detail, > > + _("The slot has remained idle since %s, which is longer than the > > configured \"%s\" duration."), > > + timestamptz_to_str(inactive_since), > > + "idle_replication_slot_timeout"); > > + break; > > + > > > > errdetail: > > > > I guess it is no fault of this patch because I see you've only copied > > nearby code, but AFAICT this function is still having an each-way bet > > by using a mixture of _() macro which is for strings intended be > > translated, but then only using them in errdetail_internal() which is > > for strings that are NOT intended to be translated. Isn't it > > contradictory? Why don't we use errdetail() here? > > > > Your question is valid and I don't have an answer. I encourage you to > start a new thread to clarify this. > > > errhint: > > > > Also, the way the 'hint' is implemented can only be meaningful for > > RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was > > always strange, but now that this patch has added another kind of > > switch (cause) this hint implementation now looks increasingly hacky > > to me; it is also inflexible -- e.g. if you ever wanted to add > > different hints. A neater implementation would be to make the code > > more like how the err_detail is handled, so then the errhint string > > would only be assigned within the "case RS_INVAL_WAL_REMOVED:" > > > > This makes sense to me. > > + > + case RS_INVAL_IDLE_TIMEOUT: > + Assert(inactive_since > 0); > + /* translator: second %s is a GUC variable name */ > + appendStringInfo(&err_detail, > + _("The slot has remained idle since %s, which is longer than the > configured \"%s\" duration."), > + timestamptz_to_str(inactive_since), > + "idle_replication_slot_timeout"); > > I think the above message should be constructed on a model similar to > the following nearby message:"The slot's restart_lsn %X/%X exceeds the > limit by %llu bytes.". So, how about the following: "The slot's idle > time %s exceeds the configured \"%s\" duration"? > > Also, similar to max_slot_wal_keep_size, we should give a hint in this > case to increase idle_replication_slot_timeout. > > It is not clear why the injection point test is doing > pg_sync_replication_slots() etc. in the patch. The test should be > simple such that after creating a new physical or logical slot, enable > the injection point, then run the manual checkpoint command, and check > the invalidation status of the slot. >
Thanks for the review! I have incorporated the above comments. The test in patch-002 has been optimized as suggested and now completes in less than a second. Please find the attached v66 patch set. The base patch(v65-001) is committed now, so I have rebased the patches. Thank you, Kuroda-san, for working on patch-002. -- Thanks, Nisha
v66-0001-Introduce-inactive_timeout-based-replication-slo.patch
Description: Binary data
v66-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
Description: Binary data
v66-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patch
Description: Binary data