On Wed, Mar 27, 2024 at 1:47 AM Euler Taveira <eu...@eulerto.com> wrote:
>
> On Tue, Mar 26, 2024, at 4:12 PM, Tomas Vondra wrote:
>
> Perhaps I'm missing something, but why is NUM_CONN_ATTEMPTS even needed?
> Why isn't recovery_timeout enough to decide if wait_for_end_recovery()
> waited long enough?
>
>
> It was an attempt to decouple a connection failure (that keeps streaming the
> WAL) from recovery timeout. The NUM_CONN_ATTEMPTS guarantees that if the
> primary is gone during the standby recovery process, there is a way to
> bail out.
>
I think we don't need to check the primary if the WAL corresponding to
consistent_lsn is already present on the standby. Shouldn't we check
that first? Once we have ensured that the required WAL has been copied,
just checking server_is_in_recovery() should be sufficient. I feel that
would be a direct way of ensuring what is required, rather than verifying
it indirectly (by checking pg_stat_wal_receiver) as we do currently.

--
With Regards,
Amit Kapila.
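
To make the proposed ordering concrete, here is a rough, standalone sketch of the decision logic only. It is not taken from the patch: parse_lsn(), next_wait_state() and the RecoveryWaitState enum are hypothetical names, and I am assuming the standby's received position would be obtained separately (for example via pg_last_wal_receive_lsn()) and the recovery flag from server_is_in_recovery().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical states for the wait loop; not part of the actual patch. */
typedef enum
{
	WAIT_FOR_WAL,			/* required WAL not yet on the standby */
	WAIT_FOR_PROMOTION,		/* WAL is present; only recovery state matters */
	RECOVERY_DONE			/* standby has left recovery */
} RecoveryWaitState;

/*
 * Parse an LSN in PostgreSQL's textual "hi/lo" hexadecimal form,
 * e.g. "0/3000060". Returns false if the text is not a well-formed LSN.
 */
static bool
parse_lsn(const char *text, uint64_t *lsn)
{
	unsigned int hi;
	unsigned int lo;
	char		extra;

	assert(text != NULL && lsn != NULL);

	/* Exactly two hex fields must match; trailing junk makes this 3. */
	if (sscanf(text, "%X/%X%c", &hi, &lo, &extra) != 2)
		return false;

	*lsn = ((uint64_t) hi << 32) | lo;
	return true;
}

/*
 * Decide what the caller still has to wait for. Once the standby holds
 * WAL up to consistent_lsn, the primary no longer needs to be consulted
 * and only the in-recovery flag has to be polled.
 */
static RecoveryWaitState
next_wait_state(uint64_t received_lsn, uint64_t consistent_lsn,
				bool in_recovery)
{
	if (!in_recovery)
		return RECOVERY_DONE;
	if (received_lsn >= consistent_lsn)
		return WAIT_FOR_PROMOTION;
	return WAIT_FOR_WAL;
}
```

With something like this, the connection-attempt counting would only matter while the state is WAIT_FOR_WAL; from WAIT_FOR_PROMOTION onwards, recovery_timeout alone would bound the wait.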