Hi Amit,

If I summarize your scenario:
1. A standby S has a failover slot slot1 synchronized with slot1 on primary
P
2. We promote S
3. On P we drop slot1 and create slot1 again with failover mode (for
example, a subscriber exists on another instance)
4. A rewind is performed on P, the former primary, so that it rejoins S,
the former standby
5. On P, slot1 is automatically dropped and recreated so that it can be
synchronized
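
To make the concern concrete, here is a toy Python model of the steps above.
It is only a sketch, not PostgreSQL code: the Slot class, create_slot,
looks_like_same_slot and run_scenario are made-up names, and creation_id is a
hypothetical identifier that real slots do not carry. It shows that a check
based on name and failover alone cannot tell the re-created slot from the
original one.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical counter: real replication slots carry no such identifier.
_creation_ids = count(1)


@dataclass
class Slot:
    name: str
    failover: bool
    synced: bool
    creation_id: int  # hypothetical, for illustration only


def create_slot(name, failover, synced=False):
    """Create a brand-new slot with a fresh (hypothetical) identity."""
    return Slot(name, failover, synced, next(_creation_ids))


def looks_like_same_slot(local, remote):
    """The name-and-failover comparison discussed in this thread."""
    return local.name == remote.name and local.failover == remote.failover


def run_scenario():
    # Step 1: slot1 exists on P and a synchronized copy exists on S.
    p_slot = create_slot("slot1", failover=True)
    s_slot = Slot(p_slot.name, p_slot.failover, synced=True,
                  creation_id=p_slot.creation_id)
    # Step 2: S is promoted; its copy of slot1 keeps the original identity.
    # Step 3: on P, slot1 is dropped and created again with failover mode.
    p_slot = create_slot("slot1", failover=True)
    # Steps 4-5: after the rewind, P compares its local slot with the one on S.
    matches = looks_like_same_slot(p_slot, s_slot)
    really_same = p_slot.creation_id == s_slot.creation_id
    return matches, really_same


if __name__ == "__main__":
    print(run_scenario())
```

In this model the name-and-failover check says the slots match, while the
hypothetical creation_id says they are different slots.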

In which context could this kind of scenario happen?

Isn't the goal to find a solution for a switchover carried out for
maintenance on a Postgres cluster? The aim is to find a compromise that
covers the most likely scenarios.
Do you think we should go back to the allow_overwrite flag approach, or
look for another solution?

Best Regards,

Fabrice



On Mon, Nov 10, 2025 at 1:10 PM Amit Kapila <[email protected]> wrote:

> On Fri, Oct 31, 2025 at 2:58 PM Alexander Kukushkin <[email protected]>
> wrote:
> >
> > Instead of dropping such slots, what we actually need is a way to safely
> set synced=false->true and continue operating.
> >
> > Operating logical replication setups is already extremely complex and
> error-prone — this is not theoretical, it’s something many of us face daily.
> > So rather than adding more speculative features or workarounds, I think
> we should focus on addressing real operational pain points and the
> inconsistencies in the current design.
> >
> > A slot created on the primary (which later becomes a standby) with
> failover=true has a very clear purpose. The failover flag already indicates
> that purpose; synced shouldn’t override it.
> >
>
> I think this is not as clear as you are saying when compared to WAL. In
> failover cases, we bump the WAL timeline on the new primary and also have
> facilities like pg_rewind to ensure that the old primary can follow the
> new primary after divergence. For slots, there is no such facility.
> Now, there is an argument that for slots it is sufficient to match
> the name and failover property to say that it is okay to overwrite the
> slot on the old primary. However, it is not clear whether it is always
> safe to do so. For example, if the old primary ran for some time after
> divergence and someone re-created the slot with the same name and
> failover property, it will no longer be the same slot. Unlike WAL, we
> don't maintain the slot's history, so it is not equally clear that we
> can overwrite the old primary's slots as they are.
>
> --
> With Regards,
> Amit Kapila.
>
