On Tue, 31 Dec 2024 at 10:15, Masahiko Sawada <sawada.m...@gmail.com> wrote: > > Hi all, > > Logical decoding (and logical replication) are available only when > wal_level = logical. As the documentation says[1], Using the 'logical' > level increases the WAL volume which could negatively affect the > performance. For that reason, users might want to start with using > 'replica', but when they want to use logical decoding they need a > server restart to increase wal_level to 'logical'. My goal is to allow > users who are using 'replica' level to use logical decoding without a > server restart. There are other GUC parameters related to logical > decoding and logical replication such as max_wal_senders, > max_logical_replication_workers, and max_replication_slots, but even > if users set these parameters >0, there would not be a noticeable > performance impact. And their default values are already >0. So I'd > like to focus on making only the wal_level dynamic GUC parameter. > There are several earlier discussions[2][3] but no one has submitted > patches unless I'm missing something. > > The first idea I came up with is to make the wal_level a PGC_SIGHUP > parameter. However, it affects not only setting 'replica' to 'logical' > but also setting 'minimal' to 'replica' or higher. I'm not sure the > latter case is common and it might require a checkpoint. I don't want > to make the patch complex for uncommon cases. > > The second idea is to somehow allow both WAL-logging logical info and > logical decoding even when wal_level is 'replica'. I've attached a PoC > patch for that. The patch introduces new SQL functions such as > pg_activate_logical_decoding() and pg_deactivate_logical_decoding(). > These functions are available only when wal_level is 'repilca'(or > higher). In pg_activate_logical_decoding(), we set the status of > logical decoding stored on the shared memory from 'disabled' to > 'xlog-logical-info', allowing all processes to write logical > information to WAL records for logical decoding. But the logical > decoding is still not allowed. Once we confirm all in-progress > transactions completed, we switch the status to > 'logical-decoding-ready', meaning that users can create logical > replication slots and use logical decoding.
I felt that the only disadvantage with this approach is that we currently wait for all in-progress transactions to complete before enabling logical decoding. If a long-running transaction exists and the session enabling logical decoding fails—due to factors like statement_timeout, transaction_timeout, idle_in_transaction_session_timeout, idle_session_timeout, or any other failure. This would require restarting the wait. During this time, there's a risk that a new long-running transaction could start, further delaying the process. Probably this could be solved if the waiting is done from any of the background processes through PGC_SIGHUP. While this type of failure is likely rare, I’m unsure whether we should consider this scenario. Thoughts? Regards, Vignesh