On Fri, 14 Apr 2023 at 16:00, Hayato Kuroda (Fujitsu)
<kuroda.hay...@fujitsu.com> wrote:
>
> Dear Julien,
>
> > Sorry for the delay, I didn't have time to come back to it until this
> > afternoon.
>
> No issues, everyone is busy :-).
>
> > I don't think that your analysis is correct.  Slots are guaranteed to be
> > stopped after all the normal backends have been stopped, exactly to avoid
> > such extraneous records.
> >
> > What is happening here is that the slot's confirmed_flush_lsn is properly
> > updated in memory and ends up being the same as the current LSN before the
> > shutdown.  But as it's a logical slot and those records aren't decoded, the
> > slot isn't marked as dirty and therefore isn't saved to disk.  You don't
> > see that behavior when doing a manual checkpoint before (per your script
> > comment), as in that case the checkpoint also tries to save the slot to
> > disk but then finds a slot that was marked as dirty and therefore saves it.
> >
> > In your script's scenario, when you restart the server the previous slot
> > data is restored and the confirmed_flush_lsn goes backward, which explains
> > those extraneous records.
>
> So you are saying that the key point is that records which are not sent to
> the subscriber do not mark the slot as dirty, and hence the updated
> confirmed_flush is not written into the slot file. Is that right?
> LogicalConfirmReceivedLocation() is called by the walsender when the process
> gets a reply from the worker process, so your analysis seems correct.
>
> > It's probably totally harmless to throw away that value for now (and
> > probably also doesn't lead to a crazy amount of work after restart, I
> > really don't know much about the logical slot code), but it clearly
> > becomes problematic with your use case.  One easy way to fix this is to
> > teach the checkpoint code to force saving the logical slots to disk even
> > if they're not marked as dirty during a shutdown checkpoint, as done in
> > the attached v1 patch (renamed as .txt to not interfere with the cfbot).
> > With this patch applied I reliably only see a final shutdown checkpoint
> > record with your scenario.
> >
> > Now such a change will make shutdown a bit more expensive when using
> > logical replication, even if in 99% of cases you will not need to save the
> > confirmed_flush_lsn value, so I don't know if that's acceptable or not.
>
> In any case these records must be advanced. IIUC, such records are currently
> read after a restart but ignored, and this patch just skips them. I have not
> measured it, but this may not be additional overhead so much as a trade-off.
>
> I have not come up with another solution so far, so I have included your
> patch. Please see 0002.
>
> Additionally, I added a checking function in 0003.
> According to pg_resetwal and other code, the length of a CHECKPOINT_SHUTDOWN
> record seems to be (SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort +
> sizeof(CheckPoint)). Therefore, the function ensures that the difference
> between the current insert position and confirmed_flush_lsn is less than the
> above plus a page header.
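For reference, here is a rough sketch of the shutdown-checkpoint change
described in the quoted text above (writing out logical slots even when they
are not marked dirty). It is only an illustration based on
CheckPointReplicationSlots() in src/backend/replication/slot.c, not the actual
v1/0002 patch; the is_shutdown flag passed down from the checkpointer is an
assumption:

/*
 * Illustrative sketch only.  The is_shutdown argument is assumed to be
 * supplied by the checkpointer when it runs the shutdown checkpoint.
 */
void
CheckPointReplicationSlots(bool is_shutdown)
{
	int			i;

	elog(DEBUG1, "performing replication slot checkpoint");

	/* Prevent slots from being created/dropped while we iterate. */
	LWLockAcquire(ReplicationSlotAllocationLock, LW_SHARED);

	for (i = 0; i < max_replication_slots; i++)
	{
		ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
		char		path[MAXPGPATH];

		if (!s->in_use)
			continue;

		sprintf(path, "pg_replslot/%s", NameStr(s->data.name));

		/*
		 * confirmed_flush updates alone do not dirty a logical slot, so the
		 * latest value can be lost across a restart.  On the shutdown
		 * checkpoint, force the slot to disk so the on-disk copy matches
		 * what was confirmed in memory.
		 */
		if (is_shutdown && SlotIsLogical(s))
		{
			SpinLockAcquire(&s->mutex);
			s->just_dirtied = true;
			s->dirty = true;
			SpinLockRelease(&s->mutex);
		}

		/* save the slot to disk; locking is handled in SaveSlotToPath() */
		SaveSlotToPath(s, path, LOG);
	}
	LWLockRelease(ReplicationSlotAllocationLock);
}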
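And an illustration of the kind of distance check described for 0003. The
function name is hypothetical and the actual patch may perform this check
elsewhere (e.g. driven from pg_upgrade); the constants are the ones named in
the quoted text (xlogrecord.h / xlog_internal.h):

/*
 * Illustration only: after a clean shutdown, the only WAL allowed past the
 * slot's confirmed_flush should be the shutdown checkpoint record itself,
 * possibly preceded by a page header.
 */
#define SHUTDOWN_CHKPT_RECORD_LEN \
	(SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint))

static void
ensure_slot_caught_up(XLogRecPtr confirmed_flush)
{
	XLogRecPtr	insert_pos = GetXLogInsertRecPtr();

	if (insert_pos - confirmed_flush >
		SHUTDOWN_CHKPT_RECORD_LEN + SizeOfXLogLongPHD)
		ereport(ERROR,
				(errmsg("logical replication slot has not consumed all WAL")));
}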
Logical replication slots can be created only if wal_level >= logical, but
currently there is no check that wal_level >= logical when the
"--include-logical-replication-slots" option is specified. If
--include-logical-replication-slots is specified, pg_upgrade creates the
replication slots only after a long series of steps (performing prechecks,
analyzing, freezing, deleting, restoring, copying, and setting up related
objects), so we end up erroring out only after a lot of time has already been
spent (in many cases pg_upgrade takes hours to perform these operations). I
feel it would be better to add a check at the beginning, somewhere in
check_new_cluster(), to verify that wal_level is set appropriately when
--include-logical-replication-slots is given, so the error is detected and
thrown early. A rough sketch of such a check follows below.

Regards,
Vignesh
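For illustration only (not an actual patch): the function name and the
user_opts field are assumptions, while connectToServer(), executeQueryOrDie()
and pg_fatal() are the existing pg_upgrade helpers.

/* Sketch: early verification, called from check_new_cluster(). */
static void
check_new_cluster_logical_replication_slots(void)
{
	PGconn	   *conn;
	PGresult   *res;

	/* user_opts.include_logical_slots is an assumed option field */
	if (!user_opts.include_logical_slots)
		return;

	conn = connectToServer(&new_cluster, "template1");
	res = executeQueryOrDie(conn, "SHOW wal_level");

	if (strcmp(PQgetvalue(res, 0, 0), "logical") != 0)
		pg_fatal("wal_level must be \"logical\" on the new cluster when "
				 "--include-logical-replication-slots is specified");

	PQclear(res);
	PQfinish(conn);
}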