Re: pg_upgrade and logical replication

Amit Kapila Fri, 17 Feb 2023 02:43:25 -0800

On Fri, Feb 17, 2023 at 1:24 PM Julien Rouhaud <rjuju...@gmail.com> wrote:
>
> I was working on testing a major upgrade scenario using a mix of physical and
> logical replication when I faced some unexpected problem leading to missing
> rows.  Note that my motivation is to rely on physical replication / physical
> backup to avoid recreating a node from scratch using logical replication, as
> the initial sync with logical replication is much more costly and impacting
> compared to pg_basebackup / restoring a physical backup, but the same problem
> exist if you just pg_upgrade a node that has subscriptions.
>
> The problem is that pg_upgrade creates the subscriptions on the newly upgraded
> node using "WITH (connect = false)", which seems expected as you obviously
> don't want to try to connect to the publisher at that point.  But then once 
> the
> newly upgraded node is restarted and ready to replace the previous one, unless
> I'm missing something there's absolutely no possibility to use the created
> subscriptions without losing some data from the publisher.
>
> The reason is that the subscription doesn't have a local list of relation to
> process until you refresh the subscription, but you can't refresh the
> subscription without enabling it (and you can't enable it in a transaction),
> which means that you have to let the logical worker start, consume and ignore
> all changes that happened on the publisher side until the refresh happens.
>
> An easy workaround that I tried is to allow something like
>
> ALTER SUBSCRIPTION ...  ENABLE WITH (refresh = true, copy_data = false)
>
> so that the refresh internally happens before the apply worker is started and
> you just keep consuming the delta, which works on naive scenario.
>
> One concern I have with this approach is that the default values for both
> "refresh" and "copy_data" for all other subcommands is "true, but we would
> probably need a different default value in that exact scenario (as we know we
> already have the data).  I think that it would otherwise be safe in my very
> specific scenario, assuming that you created the slot beforehand and moved the
> slot's LSN at the promotion point, as even if you add non-empty tables to the
> publication you will only need the delta whether those were initially empty or
> not given your initial physical replica state.
>


This point is not very clear. Why would one just need delta even for new tables?

>  Any other scenario would make
> this new option dangerous, if not entirely useless, but not more than any of
> the current commands that lead to refreshing a subscription and have the same
> options I guess.
>
> All in all, currently the only way to somewhat safely resume logical
> replication after a pg_upgrade is to drop all the subscriptions that were
> transferred during pg_upgrade on all databases and recreate them (using the
> existing slots on the publisher side obviously), allowing the initial
> connection.  But this approach only works in the exact scenario I mentioned
> (physical to logical replication, or at least a case where *all* the tables
> where logically replicated prior to the pg_ugprade), otherwise you have to
> recreate the follower node from scratch using logical repication.
>

I think if you dropped and recreated the subscriptions by retaining
old slots, the replication should resume from where it left off before
the upgrade. Which scenario are you concerned about?

> Is that indeed the current behavior, or did I miss something?
>
> Is this "resume logical replication on pg_upgraded node" something we want to
> support better?  I was thinking that we could add a new pg_dump mode (maybe
> only usable during pg_upgrade) that also restores the pg_subscription_rel
> content in each subscription or something like that.  If not, should 
> pg_upgrade
> keep preserving the subscriptions as it doesn't seem safe to use them, or at
> least document the hazards (I didn't find anything about it in the
> documentation)?
>
>

There is a mention of this in pg_dump docs. See [1] (When dumping
logical replication subscriptions ...)

[1] - https://www.postgresql.org/docs/devel/app-pgdump.html

-- 
With Regards,
Amit Kapila.

Re: pg_upgrade and logical replication

Reply via email to