Re: Add pg_stat_recovery system view

Xuneng Zhou Wed, 04 Mar 2026 23:37:40 -0800

Hi Michael,

Thanks for the detailed feedback!


On Thu, Mar 5, 2026 at 11:58 AM Michael Paquier <[email protected]> wrote:
>
> On Wed, Mar 04, 2026 at 09:02:29AM +0800, Xuneng Zhou wrote:
> > Just rebase.
>
> I have applied 0001, that simply moves some code around.
>

Thanks for applying.

> Regarding 0002, I would recommend to not bump the catalog version in
> catversion.h when sending a patch to the lists.  This task is left to
> committers when the code gets merged into the tree.  And this is
> annoying for submitters because it can create a lot of conflicts.

I'll leave it untouched.

> Using a set-returning function is I think wrong here, betraying the
> representation of the recovery status as stored in the system.  We
> know that there is only one state of recovery, fixed in shared memory.
> Like the cousins of this new function, let's make thinks non-SRF,
> returning one row all the time with PG_RETURN_NULL() if the conditions
> for information display are not satisfied.  When we are not in
> recovery or when the role querying the function is not granted
> ROLE_PG_READ_ALL_STATS, that will simplify the patch as there is no
> need for the else branch with the nulls, as written in your patch.
> The field values are acquired the right way, spinlock acquisitions
> have to be short.

My earlier thought for keeping pg_stat_get_recovery as an SRF is to
make pg_stat_recovery simpler by avoiding an extra filter such as
WHERE s.promote_triggered IS NOT NULL to preserve 0/1-row semantics.
pg_stat_get_recovery_prefetch also uses the SRF pattern.

> Like pg_stat_wal_receiver, let's add to the view definition a qual to
> return a row only if the fields are not NULL.
>
> pg_get_wal_replay_pause_state() displays the pause_state, and it is
> not hidden behind the stats read role.  I am not really convinced that
> this is worth bothering to treat as an exception in this patch.  It
> makes it a bit more complicated, for little gain.  I would just hide
> all the fields behind the role granted, to keep the code simpler, even
> if that's slightly inconsistent with pg_get_wal_replay_pause_state().

I agree that exposing a subset of columns unconditionally is not worth
the added complexity.

> After writing this last point, as far as I can see there is a small
> optimization possible in the patch.  When a role is not granted
> ROLE_PG_READ_ALL_STATS, we know that we will not return any
> information so we could skip the spinlock acquisition and avoid
> spinlock acquisitions when one queries the function but has no access
> to its data.

Makes sense to me. I'll apply it as suggested.

> +       True if a promotion has been triggered for this standby server.
>
> Standby servers are implied if data is returned, this sentence can be
> simpler and cut the "standby server" part.

+ 1

> +       Start LSN of the last WAL record replayed during recovery.
> [...]
> +       End LSN of the last WAL record replayed during recovery.
> [...]
> +       Timeline of the last replayed WAL record.
> For other system views with LSN information, we don't use "LSN", but
> "write-ahead log location".  I'd suggest the same term here.  These
> three fields refer to the last record *successfully* replayed.  It
> seems important to mention this fact, I guess?

I'll replace them.

> +       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
> +      </para>
> +      <para>
> +       Current replay position. When replaying a record, this is the end
> +       position of the record being replayed; otherwise it equals
> +       <structfield>last_replayed_end_lsn</structfield>.
>
> Slightly inexact.  This is the end LSN + 1.

Yeh, this needs to be corrected.

> +       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
> [..]
> +       <structfield>replay_end_tli</structfield> <type>integer</type>
>
> Okay, I can fall behind the addition of these two, it could be helpful
> in debugging something like a locking issue when replaying a specific
> record, I guess.  At least I'd want to know what is happening for a
> record currently being replayed.  It seems to me that this could be
> more precise, mentioning that these refer to a record *currently*
> being replayed.

I will adjust the docs to say these describe the record currently
being replayed, with replay_end_lsn being the end position + 1.

> Is recoveryLastXTime actually that relevant to have?  We use it for
> logging and for recovery target times, which is something less
> relevant than the other fields, especially if we think about standbys
> where these have no targets to reach.

I agree it is less central for standby monitoring (and partly overlaps
with pg_last_xact_replay_timestamp()), so I’ll remove it from this
view in the next revision.

> currentChunkStartTime, on the contrary, is much more relevant to me,
> due to the fact that we use it in WaitForWALToBecomeAvailable() with
> active WAL receivers running.

Yeah, it could be useful for apply-delay/catch-up diagnostics.

-- 
Best,
Xuneng

Re: Add pg_stat_recovery system view

Reply via email to