Hi,

On Mon, Apr 21, 2025 at 10:31:03AM -0700, Masahiko Sawada wrote:
> I would like to discuss behavioral and user interface considerations.
> 
> Upon further analysis of this patch regarding the conversion of
> wal_level to a SIGHUP parameter, I find that supporting all
> combinations of wal_level value changes might make less sense.
> Specifically, changing to or from 'minimal' would necessitate a
> checkpoint, and reducing wal_level to 'minimal' would require
> terminating physical replication, WAL archiving, and online backups.
> While these operations demand careful consideration, there seems to be
> no compelling use case for decreasing to 'minimal'. Furthermore,
> increasing wal_level from 'minimal' is typically a one-time operation
> during a database's lifetime. Therefore, we should weigh the benefits
> against the implementation complexity.

Agreed.

> One solution is to manage the effective WAL level using two distinct
> GUC parameters: max_wal_level and wal_level. max_wal_level would be a
> POSTMASTER parameter controlling the system's maximum allowable WAL
> level, with values 'minimal', 'replica', and 'logical'. wal_level
> would function as a SIGHUP parameter managing the runtime WAL level,
> accepting values 'replica', 'logical', and 'auto'. The selected value
> must be either 'auto' or not exceed max_wal_level. When set to 'auto',
> wal_level automatically synchronizes with max_wal_level's value. This
> approach would enable online WAL level transitions between 'replica'
> and 'logical'.

That makes sense to me. I think that 'logical' could be the default value
for max_wal_level and 'replica' the default for wal_level. That would provide
almost the same user experience as today and would allow a replica->logical
change without a restart. Thoughts?
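
As a rough illustration of the rule proposed above, here is a small Python
sketch. It is not code from the patch: the function name, the error messages,
and the numeric ordering of the levels are assumptions made only for this
example.

```python
# Hypothetical model of the proposed max_wal_level / wal_level rule.
# Not PostgreSQL code; names and messages are illustrative only.
LEVEL_ORDER = {"minimal": 0, "replica": 1, "logical": 2}

# Values the proposal lets wal_level take ('minimal' is excluded).
RUNTIME_VALUES = ("replica", "logical", "auto")


def effective_wal_level(max_wal_level: str, wal_level: str) -> str:
    """Return the WAL level in effect, or raise ValueError if the pair is invalid."""
    if max_wal_level not in LEVEL_ORDER:
        raise ValueError(f"invalid max_wal_level: {max_wal_level!r}")
    if wal_level not in RUNTIME_VALUES:
        raise ValueError(f"invalid wal_level: {wal_level!r}")
    if wal_level == "auto":
        # 'auto' follows max_wal_level.
        return max_wal_level
    if LEVEL_ORDER[wal_level] > LEVEL_ORDER[max_wal_level]:
        raise ValueError(
            f"wal_level {wal_level!r} exceeds max_wal_level {max_wal_level!r}"
        )
    return wal_level


# With the defaults suggested above (max_wal_level = 'logical',
# wal_level = 'replica'), a reload can move between 'replica' and 'logical'.
print(effective_wal_level("logical", "replica"))
print(effective_wal_level("logical", "logical"))
```

In this model a reload that sets wal_level to 'logical' is only rejected when
max_wal_level was started below 'logical', which is the case that would still
need a restart.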

> Regarding logical decoding on standbys, currently both primary and
> standby servers must have wal_level set to 'logical'. We need to
> determine the appropriate behavior when users decrease the WAL level
> from 'logical' to 'replica' through configuration file reload.
> 
> One approach would be to invalidate all logical replication slots on
> the standby when transitioning to 'replica' WAL level. Although
> incoming WAL records from the primary would still be written at
> 'logical' level, making logical decoding technically feasible, this
> behavior seems logical as it reflects the user's intent to discontinue
> logical decoding on the standby.

+1

> For consistency, we might need to
> invalidate logical slots during server startup if the WAL level is
> insufficient.

Not sure. Currently we'd not allow the standby to start:

"
LOG:  entering standby mode
FATAL:  logical replication slot "logical_slot" exists, but "wal_level" < "logical"
HINT:  Change "wal_level" to be "logical" or higher.
LOG:  startup process (PID 1790508) exited with exit code 1
"

I think that's a good guard against configuration mistakes. If the change was
a mistake, change back to logical and start. If it was not a mistake, change
back to logical, start, then lower the level with SIGHUP. OTOH I also see the
benefit of being consistent between SIGHUP and start.
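
To make the two behaviors being compared concrete, here is a toy Python model.
It is only a sketch: the function names are hypothetical, slots are reduced to
a list of names, and the reload behavior is the invalidation approach proposed
upthread, not something that exists today.

```python
# Toy model of the two standby behaviors discussed above.
# Not PostgreSQL code; function names are hypothetical.

def check_startup(wal_level: str, logical_slots: list[str]) -> list[str]:
    """Current behavior: refuse to start if a logical slot exists
    and wal_level is below 'logical'."""
    if logical_slots and wal_level != "logical":
        raise RuntimeError(
            f'logical replication slot "{logical_slots[0]}" exists, '
            'but "wal_level" < "logical"'
        )
    return list(logical_slots)


def apply_reload(new_wal_level: str, logical_slots: list[str]) -> list[str]:
    """Proposed behavior on SIGHUP: lowering wal_level below 'logical'
    invalidates every logical slot; returns the slots still valid."""
    if new_wal_level != "logical":
        return []
    return list(logical_slots)


slots = ["logical_slot"]
print(apply_reload("replica", slots))    # reload: slots are invalidated
try:
    check_startup("replica", slots)      # startup: server does not start
except RuntimeError as err:
    print(f"FATAL:  {err}")
```

The inconsistency is visible here: the same configuration makes startup fail
but lets a reload proceed by dropping the slots' validity.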

> Alternatively, we could permit logical decoding on the standby even
> with wal_level set to 'replica'.

Yeah, technically speaking we could, as the WAL is coming from the primary
(which has wal_level set to logical).

> However, this would necessitate
> invalidating all logical replication slots during promotion,
> potentially extending downtime during failover.

Yeah, I'm tempted to vote to not allow logical decoding on the standby if the
wal_level is not logical.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com