Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs

Thomas Munro Mon, 12 Aug 2019 16:17:10 -0700

On Wed, Jul 17, 2019 at 1:52 PM Michael Paquier <mich...@paquier.xyz> wrote:
> I got surprised by the following behavior from pg_stat_get_wal_senders
> when connecting for example pg_receivewal to a primary:
> =# select application_name, flush_lsn, replay_lsn, flush_lag,
> replay_lag from pg_stat_replication;
>  application_name | flush_lsn | replay_lsn |    flush_lag    |    replay_lag
> ------------------+-----------+------------+-----------------+-----------------
>  receivewal       | null      | null       | 00:09:13.578185 | 00:09:13.578185
> (1 row)
>
> It makes little sense to me, as we are reporting a replay lag on a
> position which has never been reported yet, so it cannot actually be
> used as a comparison base for the lag.  Am I missing something or
> should we return NULL for those fields if we have no write, flush or
> apply LSNs like in the attached?


Hmm.  It's working as designed, but indeed it's not very newsworthy
information in this case.  If you run pg_receivewal --synchronous then
you get sensible looking flush_lag times.  Without that, flush_lag
only goes up, and of course replay_lag only goes up, so although it's
telling the truth, I think your proposal makes sense.
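
As a rough illustration of the rule being proposed (a Python sketch, not
the actual patch; visible_lag, mask_row and INVALID_LSN are hypothetical
names, and using 0 for "never reported" is an assumption modelled on
InvalidXLogRecPtr), a lag value is shown only when the matching LSN has
been reported:

```python
from datetime import timedelta
from typing import Optional

# Assumption: 0 stands in for an LSN that was never reported,
# like InvalidXLogRecPtr in the server.
INVALID_LSN = 0


def visible_lag(lsn: Optional[int], lag: Optional[timedelta]) -> Optional[timedelta]:
    """Return the lag only when the matching LSN has been reported."""
    if lsn is None or lsn == INVALID_LSN:
        return None  # nothing to compare against, so hide the lag
    return lag


def mask_row(row: dict) -> dict:
    """Apply the rule to the write, flush and replay columns of one row."""
    masked = dict(row)  # copy, so the caller's row is left untouched
    for kind in ("write", "flush", "replay"):
        masked[f"{kind}_lag"] = visible_lag(
            row.get(f"{kind}_lsn"), row.get(f"{kind}_lag")
        )
    return masked
```

With a row like the pg_receivewal one quoted above (NULL flush_lsn and
replay_lsn), mask_row would return None for flush_lag and replay_lag.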

One question I had is what would happen with your patch without
--synchronous, once it flushes a whole file and opens a new one; I
wondered if your new boring-information-hiding behaviour would stop
working after one segment file because of that.  I tested that and the
column remains NULL when we move to a new file, so that's good.

One thing I noticed in passing is that you always get the same times
in the write_lag and flush_lag columns, in --synchronous mode, and the
times update infrequently.  That's not the case with regular
replicas; I suspect there is a difference in the time and frequency of
replies sent to the server, which I guess might make synchronous
commit a bit "lumpier", but I didn't dig further today.

-- 
Thomas Munro
https://enterprisedb.com
