Logical walsenders don't process XLOG_CHECKPOINT_SHUTDOWN

Amit Kapila Tue, 25 Jul 2023 02:01:34 -0700

Currently, we don't perform $SUBJECT at the time of shutdown of the
server. I think this currently has only a minor impact: after a
restart, subscribers will ask to start processing from before the
XLOG_CHECKPOINT_SHUTDOWN record, or after a switchover the old
publisher will have an extra WAL record. However, if we want to
support upgrading the publisher node such that the existing slots
are copied/created into the new cluster, we need to ensure that all the
changes generated on the publisher have been sent to and applied on the
subscriber. This is a hard requirement because we reset the WAL after
the upgrade, so any WAL that has not been sent will be lost. Now, even
a clean shutdown of the publisher node can't ensure that all the WAL
has been sent, because it is quite possible that the subscriber node
is down, in which case no walsenders will be available at shutdown
time to send the data. Similarly, there could be some logical slots
created via a backend which may not have processed all the data, and
we can't copy those slots as-is during the upgrade.


To ensure that all the data has been sent during the upgrade, we can
ensure that each logical slot's confirmed_flush_lsn (position in the
WAL till which subscriber has confirmed that it has applied the WAL)
is the same as current_wal_insert_lsn. Now, because we don't send
XLOG_CHECKPOINT_SHUTDOWN even on clean shutdown, confirmed_flush_lsn
will never be the same as current_wal_insert_lsn. One idea being
discussed in patch [1] (see 0003) is to ensure that each slot's LSN is
exactly one XLOG_CHECKPOINT_SHUTDOWN record behind. That probably has
some drawbacks: if we tomorrow add some other WAL record in the
shutdown checkpoint path, or the size of the record changes, we would
need to modify the corresponding code in the upgrade.
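
To make the two checks above concrete, here is a rough sketch (not
what the patch does, just an illustration). It assumes the slot
positions and the insert position have already been fetched as text,
e.g. from pg_replication_slots.confirmed_flush_lsn and
pg_current_wal_insert_lsn(). The function names, the slot names, the
LSN values and the 112-byte record size are all made up for the
example; the last one is exactly the kind of hardcoded number that
makes the "one record behind" idea fragile.

```python
def parse_lsn(text):
    """Convert an LSN in its 'high/low' hexadecimal text form to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slots_not_caught_up(slots, insert_lsn, allowed_gap=0):
    """Return the names of slots that trail insert_lsn by more than allowed_gap bytes.

    slots is an iterable of (slot_name, confirmed_flush_lsn) pairs, with the
    LSN given as text. allowed_gap=0 is the strict "must be equal" check.
    """
    target = parse_lsn(insert_lsn)
    return [
        name
        for name, confirmed in slots
        if target - parse_lsn(confirmed) > allowed_gap
    ]


# Hypothetical values, for illustration only.
HYPOTHETICAL_SHUTDOWN_RECORD_SIZE = 112  # assumption, not the real record size
slots = [("sub_a", "0/3000148"), ("sub_b", "0/30000D8")]

# Strict check: confirmed_flush_lsn must equal the insert position.
strict_failures = slots_not_caught_up(slots, "0/3000148")

# Relaxed check: tolerate a gap of one (assumed-size) shutdown checkpoint record.
relaxed_failures = slots_not_caught_up(
    slots, "0/3000148", allowed_gap=HYPOTHETICAL_SHUTDOWN_RECORD_SIZE
)
```

The relaxed variant only works as long as the assumed gap matches what
the shutdown checkpoint path actually writes, which is the drawback
mentioned above.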

The other possibility is that we allow logical walsenders to process
XLOG_CHECKPOINT_SHUTDOWN before shutdown, after which during the
upgrade confirmed_flush_lsn will be the same as
current_wal_insert_lsn. AFAICU, the primary reason that we don't allow
it is that we want to avoid writing any new WAL after the shutdown
checkpoint (to avoid any sort of PANIC as discussed in the thread [2]).
Writing new WAL is possible during decoding due to hint bits, but it
doesn't seem that decoding of XLOG_CHECKPOINT_SHUTDOWN can lead to any
hint bit updates. It seems we made these changes as part of commit
c6c3334364 [3]. Note that even if we can ensure that walsenders send
all the WAL before shutdown and bring the corresponding logical slots
up-to-date so that there is no pending data, it would still be
possible that logical slots created manually via backends won't
consume all the WAL before shutdown. I think those will be the
responsibility of users, as they created them.

We can also provide some guidelines to users similar to what we have
for physical standbys in the pg_upgrade docs [4] (see step 9, "Prepare
for standby server upgrades"). Something like: before upgrading, verify
that the subscriber is caught up with the publisher by comparing the
current WAL position on the publisher with
pg_stat_subscription.received_lsn on the subscriber.
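
As a rough sketch of what such a guideline could boil down to, a user
would run something like SELECT pg_current_wal_lsn() on the publisher
and SELECT subname, received_lsn FROM pg_stat_subscription on the
subscriber, then compare the results. The helper below does only the
comparison; its name, the subscription names and the LSN values are
assumptions made up for the example, and a NULL received_lsn is
reported as "unknown" rather than as caught up.

```python
def parse_lsn(text):
    """Convert an LSN in its 'high/low' hexadecimal text form to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def subscription_lag(publisher_lsn, subscriptions):
    """Map each subscription name to the bytes it trails the publisher by.

    subscriptions is an iterable of (subname, received_lsn) pairs; a
    received_lsn of None (NULL in the view) maps to None, meaning unknown.
    """
    pub = parse_lsn(publisher_lsn)
    return {
        name: None if received is None else pub - parse_lsn(received)
        for name, received in subscriptions
    }


# Hypothetical values, for illustration only.
lag = subscription_lag(
    "2/A0000028",
    [("sub_a", "2/A0000028"), ("sub_b", "2/A0000000"), ("sub_c", None)],
)
caught_up = [name for name, behind in lag.items() if behind == 0]
```

Only subscriptions whose lag is exactly zero should be treated as safe
here; anything else, including unknown, needs another look before the
upgrade.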

Any better ideas or thoughts on the above?

[1] - https://www.postgresql.org/message-id/TYAPR01MB586619721863B7FFDAC4369FF550A%40TYAPR01MB5866.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAHGQGwEsttg9P9LOOavoc9d6VB1zVmYgfBk%3DLjsk-UL9cEf-eA%40mail.gmail.com
[3] -
commit c6c333436491a292d56044ed6e167e2bdee015a2
Author: Andres Freund <and...@anarazel.de>
Date:   Mon Jun 5 18:53:41 2017 -0700

    Prevent possibility of panics during shutdown checkpoint.
[4] - https://www.postgresql.org/docs/devel/pgupgrade.html

-- 
With Regards,
Amit Kapila.
