Hi,

On Wed, Jun 10, 2026 at 5:14 PM Fujii Masao <[email protected]> wrote:
>
> Hi,
>
> While working on a patch discussed in [1], I looked into how
> log_startup_progress_interval behaves during recovery. During that
> investigation, I noticed the following comment in
> EnableStandbyMode():
>
>     /*
>      * To avoid server log bloat, we don't report recovery progress in a
>      * standby as it will always be in recovery unless promoted. We disable
>      * startup progress timeout in standby mode to avoid calling
>      * startup_progress_timeout_handler() unnecessarily.
>      */
>
> So in standby mode, we intentionally suppress recovery progress
> logging during WAL replay, because otherwise a standby could emit
> progress messages indefinitely until promotion.
>
> However, some startup operations executed afterward, such as
> ResetUnloggedRelations(), can re-enable the timeout. As a result, the
> startup progress timeout can remain active during standby WAL replay,
> which contradicts the intent described in the comment above.

Nice catch!

A discussion for another thread: I wish we had these progress reports
emitted even for the startup process on a standby, to help with post-hoc
analysis of customer issues such as failover, slow replay, WAL growth on
the primary, etc. However, I agree that the volume of log messages could
grow without bound over a replica's lifetime, because the same timeout
parameter would be used at two very different time scales. On the
primary, when startup is slow, one wants to know the recovery rate. On
replicas, we would want this info much less often, say, every 5 min (288
messages per day), 10 min (144 messages per day), or 20 min (72 messages
per day).

I'd prefer NOT to add a new GUC for this; perhaps the same GUC,
log_startup_progress_interval, could log at different scales on the
primary versus the replica.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
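
PS: For reference, the per-day counts above are just the number of seconds in a day divided by the reporting interval. A minimal sketch of that arithmetic follows; progress_logs_per_day() and SECONDS_PER_DAY are hypothetical names used only for illustration, not existing PostgreSQL symbols, and the sketch assumes exactly one message is emitted per interval.

```c
#include <assert.h>

/* Seconds in one day; local to this sketch, not a PostgreSQL macro. */
#define SECONDS_PER_DAY 86400

/*
 * Hypothetical helper: how many progress messages a standby would emit
 * per day if it logged exactly one message every interval_secs seconds.
 */
int
progress_logs_per_day(int interval_secs)
{
    assert(interval_secs > 0);
    return SECONDS_PER_DAY / interval_secs;
}
```

With intervals of 300, 600, and 1200 seconds this gives 288, 144, and 72 messages per day, matching the figures above.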
