On Fri, Aug 8, 2025 at 7:01 PM Fabrice Chapuis <fabrice636...@gmail.com> wrote:
>
> Thanks Shveta for coming on this point again and fixing the link.
> The idea is to check if the slot has same name to try to resynchronize it 
> with the primary.
> ok the check on the failover status for the remote slot is perhaps redundant.
> I'm not sure what impact setting the synced flag to true might have. But if 
> you run an additional switchover, it works fine because the synced flag on 
> the new primary is set to true now.
> If we come back to the idea of the GUC or the API, adding an allow_overwrite 
> parameter to the pg_create_logical_replication_slot function and removing the 
> logical slot when set to true could be a suitable approach.
>
> What is your opinion?
>

If implemented as a GUC, it would address only a specific corner case,
making it less suitable to be added as a GUC.

OTOH, adding it as a slot's property makes more sense. You can start
with introducing a new slot property, allow_overwrite. By default,
this property will be set to false.

a) The function pg_create_logical_replication_slot() can be extended
to accept this parameter.
b) A new API pg_alter_logical_replication_slot() can be introduced, to
modify this property after slot creation if needed.
c) The commands CREATE SUBSCRIPTION and ALTER SUBSCRIPTION are not
needed to include an allow_overwrite parameter.  When CREATE
SUBSCRIPTION creates a slot, it will always set allow_overwrite to
false by default. If users need to change this later, they can use the
new API pg_alter_logical_replication_slot() to update the property.
d) Additionally, pg_alter_logical_replication_slot() can serve as a
generic API to modify other slot properties as well.

This appears to be a reasonable idea with potential use cases beyond
just allowing synchronization post switchover. Thoughts?

~~~

Another problem as you pointed out is inconsistent behaviour across
switchovers. On the first switchover, we get the error on new standby:
 "Exiting from slot synchronization because a slot with the same name
already exists on the standby."

But in the case of a double switchover, this error does not occur.
This is due to the 'synced' flag not set on new standby on first
switchover while set in double switchover. I think the behaviour
should be the same. In both cases, it should emit the same error. We
are thinking of a potential solution here and will start a new thread
if needed.

thanks
Shveta


Reply via email to