On Wed, Mar 4, 2026 at 1:06 PM shveta malik <[email protected]> wrote:
>
> On Wed, Mar 4, 2026 at 12:26 PM Zhijie Hou (Fujitsu)
> <[email protected]> wrote:
> >
> > On Tuesday, February 17, 2026 12:16 PM Amit Kapila 
> > <[email protected]> wrote:
> > > On Tue, Feb 17, 2026 at 9:13 AM shveta malik <[email protected]>
> > > wrote:
> > > >
> > > > On Mon, Feb 16, 2026 at 4:35 PM Amit Kapila <[email protected]>
> > > wrote:
> > > > >
> > > > > On Fri, Feb 13, 2026 at 7:54 AM Zhijie Hou (Fujitsu)
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > Thanks for pushing! Here are the remaining patches.
> > > > > >
> > > > >
> > > > > One thing that bothers me about the remaining patch is that it could
> > > > > lead to infinite retries in the worst case. For example, in the first
> > > > > try, slot-1 is not synced, say due to physical replication delays in
> > > > > flushing WAL up to the confirmed_flush_lsn of that slot; then in the
> > > > > next (re-)try, the same thing happens for slot-2; then in the next
> > > > > (re-)try, slot-3 appears to be invalidated on the standby but is
> > > > > valid on the primary, and so on. What do you think?
> > > >
> > > > Yes, that is a possibility we cannot rule out. This can also happen
> > > > during the first invocation of the API (even without the new changes)
> > > > when we attempt to create new slots, they may remain in a temporary
> > > > state indefinitely. However, that risk is limited to the initial sync,
> > > > until the slots are persisted, which is somewhat expected behavior.
> > > >
> > >
> > > Right.
> > >
> > > > With the current changes though, the possibility of an indefinite wait
> > > > exists during every run. So the question becomes: what would be more
> > > > desirable for users -- for the API to finish with the risk that a few
> > > > slots are not synced, or for the API to wait longer to ensure that all
> > > > slots are properly synced?
> > > >
> > > > I think that if the primary use case of this API is when a user plans
> > > > to run it before a scheduled failover, then it would be better for the
> > > > API to wait and ensure everything is properly synced.
> > > >
> > >
> > > I don't think we can guarantee that all slots are synced as per the
> > > latest primary state in one invocation because some newly created slots
> > > can anyway be missed. So why take the risk of infinite waits in the API?
> > > I think it may be better to extend the usage of this API (probably with
> > > more parameters) based on more user feedback.
> > >
> >
> > I agree that we could wait for more user feedback before extending the
> > retry logic to persisted slots.
> >
> > > > > Independent of whether we consider the entire patch, the following
> > > > > bit in the patch is useful as we retry to sync the slots via the API.
> > > > > @@ -218,7 +219,7 @@ update_local_synced_slot(RemoteSlot
> > > > > *remote_slot, Oid remote_dbid)
> > > > >   * Can get here only if GUC 'synchronized_standby_slots' on the
> > > > >   * primary server was not configured correctly.
> > > > >   */
> > > > > - ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR,
> > > > > + ereport(LOG,
> > > > >   errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > > > >   errmsg("skipping slot synchronization because the received slot 
> > > > > sync"
> > > > >      " LSN %X/%08X for slot \"%s\" is ahead of the standby position
> > > > > %X/%08X",
> > > > >
> > > >
> > > > yes. I agree.
> > > >
> > >
> > > Let's wait for Hou-San's opinion on this one.
> >
> > +1 for changing this.
> >
> > Here is the patch set to convert the elevel to LOG so that the function
> > retries cyclically until the standby catches up and the slot is
> > successfully persisted.
> >
>
> patch001 has:
>
> -        logical decoding and must be dropped after promotion. See
> +        logical decoding and must be dropped after promotion. This function
> +        retries cyclically until all the failover slots that existed on
> +        primary at the start of the function call are synchronized. See
>
> IIUC, this is not completely true though. If the slot is already persisted,
> we do not retry cyclically now; we skip the sync. Isn't it?
>

Right, but even if one of the slots is not persisted, it will try to
sync again, no?

> Shall this be
> changed to:
>
> It retries cyclically until all the failover slots that existed on the
> primary at the start of the function call are persisted and are
> sync-ready.
>

This is too internals-specific. I find Hou-san's version better w.r.t.
user-facing docs.

 * RS_TEMPORARY. Once the decoding from corresponding LSNs can reach a
 * consistent point, they will be marked as RS_PERSISTENT.
  *
+ * If the WAL prior to the remote slot's confirmed_flush_lsn has not been
+ * flushed on the standby, the slot is marked as RS_TEMPORARY. Once the standby
+ * catches up and flushes that WAL, the slot is promoted to RS_PERSISTENT.

BTW, I don't think we need these additional comments.
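
To make the retry semantics under discussion concrete, here is a rough
Python sketch (purely illustrative pseudologic, not the actual C
implementation of pg_sync_replication_slots(); the slot names, fields, and
the simulated standby progress are all made up for the example):

```python
# Illustrative sketch of the retry behavior being discussed: keep retrying
# until every failover slot captured at the start of the call is persisted
# (sync-ready). With the elevel changed from ERROR to LOG, a standby that
# has not yet flushed WAL up to the remote confirmed_flush_lsn is only
# logged and retried, instead of aborting the whole call.

def sync_slots_with_retry(slots):
    """slots: dict of name -> {'remote_lsn': int, 'standby_flushed_lsn': int}.

    Returns the same dict with each slot's 'state' set to RS_PERSISTENT
    once the (simulated) standby has caught up for that slot.
    """
    pending = set(slots)  # slots that existed at the start of the call
    while pending:
        for name in list(pending):
            s = slots[name]
            if s["standby_flushed_lsn"] >= s["remote_lsn"]:
                # Standby caught up: the slot can be persisted.
                s["state"] = "RS_PERSISTENT"
                pending.discard(name)
            else:
                # With the patch, this is a LOG (not ERROR), so the
                # function skips the slot now and retries on the next pass.
                print(f"LOG: skipping sync for {name}; standby is behind")
                # Simulate the standby making progress between retries.
                s["standby_flushed_lsn"] += 1
    return slots
```

Of course, the real concern raised above is exactly the case where the
standby never catches up for some slot, in which case a loop like this
never terminates; the sketch only shows the happy path.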

-- 
With Regards,
Amit Kapila.

