Hi,

On Thu, Mar 5, 2026 at 3:37 PM Xuneng Zhou <[email protected]> wrote:
>
> Hi Michael,
>
> Thanks for the detailed feedback!
>
> On Thu, Mar 5, 2026 at 11:58 AM Michael Paquier <[email protected]> wrote:
> >
> > On Wed, Mar 04, 2026 at 09:02:29AM +0800, Xuneng Zhou wrote:
> > > Just rebase.
> >
> > I have applied 0001, that simply moves some code around.
> >
>
> Thanks for applying.
>
> > Regarding 0002, I would recommend to not bump the catalog version in
> > catversion.h when sending a patch to the lists.  This task is left to
> > committers when the code gets merged into the tree.  And this is
> > annoying for submitters because it can create a lot of conflicts.
>
> I'll leave it untouched.
>
> > Using a set-returning function is I think wrong here, betraying the
> > representation of the recovery status as stored in the system.  We
> > know that there is only one state of recovery, fixed in shared memory.
> > Like the cousins of this new function, let's make things non-SRF,
> > returning one row all the time with PG_RETURN_NULL() if the conditions
> > for information display are not satisfied.  When we are not in
> > recovery or when the role querying the function is not granted
> > ROLE_PG_READ_ALL_STATS, that will simplify the patch as there is no
> > need for the else branch with the nulls, as written in your patch.
> > The field values are acquired the right way, spinlock acquisitions
> > have to be short.
>
> My earlier reason for keeping pg_stat_get_recovery as an SRF was to
> make pg_stat_recovery simpler by avoiding an extra filter such as
> WHERE s.promote_triggered IS NOT NULL to preserve 0/1-row semantics.
> pg_stat_get_recovery_prefetch also uses the SRF pattern.
>
> > Like pg_stat_wal_receiver, let's add to the view definition a qual to
> > return a row only if the fields are not NULL.
> >
> > pg_get_wal_replay_pause_state() displays the pause_state, and it is
> > not hidden behind the stats read role.  I am not really convinced that
> > this is worth bothering to treat as an exception in this patch.  It
> > makes it a bit more complicated, for little gain.  I would just hide
> > all the fields behind the role granted, to keep the code simpler, even
> > if that's slightly inconsistent with pg_get_wal_replay_pause_state().
>
> I agree that exposing a subset of columns unconditionally is not worth
> the added complexity.
>
> > After writing this last point, as far as I can see there is a small
> > optimization possible in the patch.  When a role is not granted
> > ROLE_PG_READ_ALL_STATS, we know that we will not return any
> > information so we could skip the spinlock acquisition and avoid
> > spinlock acquisitions when one queries the function but has no access
> > to its data.
>
> Makes sense to me. I'll apply it as suggested.
>
> > +       True if a promotion has been triggered for this standby server.
> >
> > Standby servers are implied if data is returned, this sentence can be
> > simpler and cut the "standby server" part.
>
> +1
>
> > +       Start LSN of the last WAL record replayed during recovery.
> > [...]
> > +       End LSN of the last WAL record replayed during recovery.
> > [...]
> > +       Timeline of the last replayed WAL record.
> > For other system views with LSN information, we don't use "LSN", but
> > "write-ahead log location".  I'd suggest the same term here.  These
> > three fields refer to the last record *successfully* replayed.  It
> > seems important to mention this fact, I guess?
>
> I'll replace them.
>
> > +       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
> > +      </para>
> > +      <para>
> > +       Current replay position. When replaying a record, this is the end
> > +       position of the record being replayed; otherwise it equals
> > +       <structfield>last_replayed_end_lsn</structfield>.
> >
> > Slightly inexact.  This is the end LSN + 1.
>
> Yeah, this needs to be corrected.
>
> > +       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
> > [..]
> > +       <structfield>replay_end_tli</structfield> <type>integer</type>
> >
> > Okay, I can get behind the addition of these two, it could be helpful
> > in debugging something like a locking issue when replaying a specific
> > record, I guess.  At least I'd want to know what is happening for a
> > record currently being replayed.  It seems to me that this could be
> > more precise, mentioning that these refer to a record *currently*
> > being replayed.
>
> I will adjust the docs to say these describe the record currently
> being replayed, with replay_end_lsn being the end position + 1.
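
To make the "end position + 1" wording concrete, here is a minimal standalone sketch. It is an illustration only: MockLSN and both helper names are hypothetical and not part of the patch, and the arithmetic deliberately ignores alignment padding and page headers, which the real WAL reader accounts for.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogRecPtr: a 64-bit byte position in WAL. */
typedef uint64_t MockLSN;

/* Position of the last byte that belongs to the record. */
static MockLSN
mock_record_last_byte(MockLSN start, uint32_t len)
{
	assert(len > 0);
	return start + len - 1;
}

/*
 * "End + 1": the first byte after the record.  Simplified here as
 * start + len; padding and page headers are ignored (an assumption of
 * this sketch, not how the server computes it).
 */
static MockLSN
mock_record_end_plus_one(MockLSN start, uint32_t len)
{
	assert(len > 0);
	return start + len;
}
```

Under these simplifications, a record starting at 0x3000028 with a length of 0x40 bytes has its last byte at 0x3000067, and the "end + 1" position is 0x3000068.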
>
> > Is recoveryLastXTime actually that relevant to have?  We use it for
> > logging and for recovery target times, which is something less
> > relevant than the other fields, especially if we think about standbys
> > where these have no targets to reach.
>
> I agree it is less central for standby monitoring (and partly overlaps
> with pg_last_xact_replay_timestamp()), so I’ll remove it from this
> view in the next revision.
>
> > currentChunkStartTime, on the contrary, is much more relevant to me,
> > due to the fact that we use it in WaitForWALToBecomeAvailable() with
> > active WAL receivers running.
>
> Yeah, it could be useful for apply-delay/catch-up diagnostics.
>

Here is the updated patch set. Please take a look.
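
For reference, the check ordering described in the 0001 commit message can be sketched in standalone C as below. This is only an illustration under stated assumptions: MockRecoveryCtl, mock_get_recovery and the lock_acquisitions counter are hypothetical stand-ins, not code from the patch, and a plain counter takes the place of the spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared recovery state (not the real struct). */
typedef struct MockRecoveryCtl
{
	uint64_t	replay_end;			/* value protected by the lock */
	int			lock_acquisitions;	/* counts how often the lock was taken */
} MockRecoveryCtl;

/*
 * Returns false (the "NULL" case) without touching the lock when there is
 * nothing to show; otherwise copies the value out under the lock.
 */
static bool
mock_get_recovery(MockRecoveryCtl *ctl, bool in_recovery,
				  bool has_stats_role, uint64_t *result)
{
	assert(ctl != NULL && result != NULL);

	/* Cheap checks first: no lock is needed to decide we return nothing. */
	if (!in_recovery || !has_stats_role)
		return false;

	ctl->lock_acquisitions++;		/* stands in for SpinLockAcquire() */
	*result = ctl->replay_end;		/* keep the critical section short */
	/* SpinLockRelease() would go here */

	return true;
}
```

With this ordering, a caller that is not in recovery or lacks the role never increments the counter, which mirrors the intent of returning NULL before taking info_lck.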


--
Best,
Xuneng
From f23f2662f41d8456a1450c98e4dd1b50ea43ed99 Mon Sep 17 00:00:00 2001
From: alterego655 <[email protected]>
Date: Tue, 27 Jan 2026 12:11:26 +0800
Subject: [PATCH v3 1/3] Add pg_stat_recovery system view

Introduce pg_stat_recovery to expose WAL recovery state maintained by the
startup process, including:
- last replayed record boundaries and timeline
- current replay end pointer and timeline
- current WAL chunk start time
- promotion trigger state
- recovery pause state

Implement pg_stat_get_recovery() as a non-SRF, following the
pg_stat_get_wal_receiver() pattern.

Gate all output behind pg_read_all_stats. If the caller lacks that role,
or recovery is not in progress, return NULL early, without taking
XLogRecoveryCtl->info_lck.
---
 doc/src/sgml/monitoring.sgml           | 150 +++++++++++++++++++++++++
 src/backend/access/transam/xlogfuncs.c |  98 ++++++++++++++++
 src/backend/catalog/system_views.sql   |  13 +++
 src/include/catalog/pg_proc.dat        |   7 ++
 src/test/regress/expected/rules.out    |  10 ++
 src/test/regress/expected/sysviews.out |   7 ++
 src/test/regress/sql/sysviews.sql      |   3 +
 7 files changed, 288 insertions(+)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dcf6e6a2f48..3bace7d1430 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -338,6 +338,15 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
       </entry>
      </row>
 
+     <row>
+      <entry><structname>pg_stat_recovery</structname><indexterm><primary>pg_stat_recovery</primary></indexterm></entry>
+      <entry>Only one row, showing statistics about the recovery state of the
+       startup process.  This view returns no row when not in recovery.
+       See <link linkend="monitoring-pg-stat-recovery-view">
+       <structname>pg_stat_recovery</structname></link> for details.
+      </entry>
+     </row>
+
      <row>
       <entry><structname>pg_stat_recovery_prefetch</structname><indexterm><primary>pg_stat_recovery_prefetch</primary></indexterm></entry>
       <entry>Only one row, showing statistics about blocks prefetched during recovery.
@@ -1912,6 +1921,147 @@ description | Waiting for a newly initialized WAL file to reach durable storage
 
  </sect2>
 
+ <sect2 id="monitoring-pg-stat-recovery-view">
+  <title><structname>pg_stat_recovery</structname></title>
+
+  <indexterm>
+   <primary>pg_stat_recovery</primary>
+  </indexterm>
+
+  <para>
+   The <structname>pg_stat_recovery</structname> view will contain only
+   one row, showing statistics about the recovery state of the startup
+   process. This view returns no row when the server is not in recovery.
+  </para>
+
+  <para>
+   All columns are restricted to members of the
+   <literal>pg_read_all_stats</literal> role.  For other users,
+   <function>pg_stat_get_recovery</function> returns
+   <literal>NULL</literal>, so this view returns no row, including for
+   the basic status columns <structfield>promote_triggered</structfield>
+   and <structfield>pause_state</structfield>.
+  </para>
+
+  <table id="pg-stat-recovery-view" xreflabel="pg_stat_recovery">
+   <title><structname>pg_stat_recovery</structname> View</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>promote_triggered</structfield> <type>boolean</type>
+      </para>
+      <para>
+       True if a promotion has been triggered.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_replayed_read_lsn</structfield> <type>pg_lsn</type>
+      </para>
+      <para>
+       Start write-ahead log location of the last successfully replayed
+       WAL record.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_replayed_end_lsn</structfield> <type>pg_lsn</type>
+      </para>
+      <para>
+       End write-ahead log location of the last successfully replayed
+       WAL record.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_replayed_tli</structfield> <type>integer</type>
+      </para>
+      <para>
+       Timeline of the last successfully replayed WAL record.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>replay_end_lsn</structfield> <type>pg_lsn</type>
+      </para>
+      <para>
+       Write-ahead log location of the record currently being replayed
+       (end position plus one).  When no record is being actively replayed,
+       equals <structfield>last_replayed_end_lsn</structfield>.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>replay_end_tli</structfield> <type>integer</type>
+      </para>
+      <para>
+       Timeline of the WAL record currently being replayed.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>current_chunk_start_time</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time when the startup process observed that replay had caught up
+       with the latest received WAL chunk.  Used in recovery-conflict
+       timing and replay/apply-lag diagnostics.  NULL if not yet
+       available.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pause_state</structfield> <type>text</type>
+      </para>
+      <para>
+       Recovery pause state. Possible values are:
+      </para>
+       <itemizedlist>
+        <listitem>
+         <para>
+          <literal>not paused</literal>: Recovery is proceeding normally.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>pause requested</literal>: A pause has been requested
+          but recovery has not yet paused.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>paused</literal>: Recovery is paused.
+         </para>
+        </listitem>
+       </itemizedlist>
+      </entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+
+ </sect2>
+
  <sect2 id="monitoring-pg-stat-recovery-prefetch">
   <title><structname>pg_stat_recovery_prefetch</structname></title>
 
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2efe4105efb..142289d8071 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -22,10 +22,13 @@
 #include "access/xlog_internal.h"
 #include "access/xlogbackup.h"
 #include "access/xlogrecovery.h"
+#include "catalog/pg_authid.h"
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "storage/spin.h"
+#include "utils/acl.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/latch.h"
@@ -748,3 +751,98 @@ pg_promote(PG_FUNCTION_ARGS)
 						   wait_seconds)));
 	PG_RETURN_BOOL(false);
 }
+
+/*
+ * pg_stat_get_recovery - returns information about WAL recovery state
+ *
+ * Returns NULL when not in recovery or when the caller lacks
+ * pg_read_all_stats privileges; one row otherwise.
+ */
+Datum
+pg_stat_get_recovery(PG_FUNCTION_ARGS)
+{
+	TupleDesc	tupdesc;
+	Datum	   *values;
+	bool	   *nulls;
+
+	/* Local copies of shared state */
+	bool		promote_triggered;
+	XLogRecPtr	last_replayed_read_lsn;
+	XLogRecPtr	last_replayed_end_lsn;
+	TimeLineID	last_replayed_tli;
+	XLogRecPtr	replay_end_lsn;
+	TimeLineID	replay_end_tli;
+	TimestampTz current_chunk_start_time;
+	RecoveryPauseState pause_state;
+
+	if (!RecoveryInProgress())
+		PG_RETURN_NULL();
+
+	if (!has_privs_of_role(GetUserId(), ROLE_PG_READ_ALL_STATS))
+		PG_RETURN_NULL();
+
+	/* Take a lock to ensure value consistency */
+	SpinLockAcquire(&XLogRecoveryCtl->info_lck);
+	promote_triggered = XLogRecoveryCtl->SharedPromoteIsTriggered;
+	last_replayed_read_lsn = XLogRecoveryCtl->lastReplayedReadRecPtr;
+	last_replayed_end_lsn = XLogRecoveryCtl->lastReplayedEndRecPtr;
+	last_replayed_tli = XLogRecoveryCtl->lastReplayedTLI;
+	replay_end_lsn = XLogRecoveryCtl->replayEndRecPtr;
+	replay_end_tli = XLogRecoveryCtl->replayEndTLI;
+	current_chunk_start_time = XLogRecoveryCtl->currentChunkStartTime;
+	pause_state = XLogRecoveryCtl->recoveryPauseState;
+	SpinLockRelease(&XLogRecoveryCtl->info_lck);
+
+	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	values = palloc0_array(Datum, tupdesc->natts);
+	nulls = palloc0_array(bool, tupdesc->natts);
+
+	values[0] = BoolGetDatum(promote_triggered);
+
+	if (XLogRecPtrIsValid(last_replayed_read_lsn))
+		values[1] = LSNGetDatum(last_replayed_read_lsn);
+	else
+		nulls[1] = true;
+
+	if (XLogRecPtrIsValid(last_replayed_end_lsn))
+		values[2] = LSNGetDatum(last_replayed_end_lsn);
+	else
+		nulls[2] = true;
+
+	if (XLogRecPtrIsValid(last_replayed_end_lsn))
+		values[3] = Int32GetDatum(last_replayed_tli);
+	else
+		nulls[3] = true;
+
+	if (XLogRecPtrIsValid(replay_end_lsn))
+		values[4] = LSNGetDatum(replay_end_lsn);
+	else
+		nulls[4] = true;
+
+	if (XLogRecPtrIsValid(replay_end_lsn))
+		values[5] = Int32GetDatum(replay_end_tli);
+	else
+		nulls[5] = true;
+
+	if (current_chunk_start_time != 0)
+		values[6] = TimestampTzGetDatum(current_chunk_start_time);
+	else
+		nulls[6] = true;
+
+	switch (pause_state)
+	{
+		case RECOVERY_NOT_PAUSED:
+			values[7] = CStringGetTextDatum("not paused");
+			break;
+		case RECOVERY_PAUSE_REQUESTED:
+			values[7] = CStringGetTextDatum("pause requested");
+			break;
+		case RECOVERY_PAUSED:
+			values[7] = CStringGetTextDatum("paused");
+			break;
+	}
+
+	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 1ea8f1faa9e..3aa77558bd6 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1005,6 +1005,19 @@ CREATE VIEW pg_stat_wal_receiver AS
     FROM pg_stat_get_wal_receiver() s
     WHERE s.pid IS NOT NULL;
 
+CREATE VIEW pg_stat_recovery AS
+    SELECT
+            s.promote_triggered,
+            s.last_replayed_read_lsn,
+            s.last_replayed_end_lsn,
+            s.last_replayed_tli,
+            s.replay_end_lsn,
+            s.replay_end_tli,
+            s.current_chunk_start_time,
+            s.pause_state
+    FROM pg_stat_get_recovery() s
+    WHERE s.promote_triggered IS NOT NULL;
+
 CREATE VIEW pg_stat_recovery_prefetch AS
     SELECT
             s.stats_reset,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index dac40992cbc..514ec8dc5ad 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5698,6 +5698,13 @@
   proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
   proargnames => '{pid,status,receive_start_lsn,receive_start_tli,written_lsn,flushed_lsn,received_tli,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time,slot_name,sender_host,sender_port,conninfo}',
   prosrc => 'pg_stat_get_wal_receiver' },
+{ oid => '9949', descr => 'statistics: information about WAL recovery',
+  proname => 'pg_stat_get_recovery', proisstrict => 'f', provolatile => 's',
+  proparallel => 'r', prorettype => 'record', proargtypes => '',
+  proallargtypes => '{bool,pg_lsn,pg_lsn,int4,pg_lsn,int4,timestamptz,text}',
+  proargmodes => '{o,o,o,o,o,o,o,o}',
+  proargnames => '{promote_triggered,last_replayed_read_lsn,last_replayed_end_lsn,last_replayed_tli,replay_end_lsn,replay_end_tli,current_chunk_start_time,pause_state}',
+  prosrc => 'pg_stat_get_recovery' },
 { oid => '6169', descr => 'statistics: information about replication slot',
   proname => 'pg_stat_get_replication_slot', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => 'text',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 78a37d9fc8f..1150dc0ebf2 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2127,6 +2127,16 @@ pg_stat_progress_vacuum| SELECT s.pid,
         END AS started_by
    FROM (pg_stat_get_progress_info('VACUUM'::text) s(pid, datid, relid, param1, param2, param3, param4, param5, param6, param7, param8, param9, param10, param11, param12, param13, param14, param15, param16, param17, param18, param19, param20)
      LEFT JOIN pg_database d ON ((s.datid = d.oid)));
+pg_stat_recovery| SELECT promote_triggered,
+    last_replayed_read_lsn,
+    last_replayed_end_lsn,
+    last_replayed_tli,
+    replay_end_lsn,
+    replay_end_tli,
+    current_chunk_start_time,
+    pause_state
+   FROM pg_stat_get_recovery() s(promote_triggered, last_replayed_read_lsn, last_replayed_end_lsn, last_replayed_tli, replay_end_lsn, replay_end_tli, current_chunk_start_time, pause_state)
+  WHERE (s.promote_triggered IS NOT NULL);
 pg_stat_recovery_prefetch| SELECT stats_reset,
     prefetch,
     hit,
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 3dd63fd88ed..132b56a5864 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -143,6 +143,13 @@ select count(*) = 0 as ok from pg_stat_wal_receiver;
  t
 (1 row)
 
+-- We expect no recovery state in this test (running on primary)
+select count(*) = 0 as ok from pg_stat_recovery;
+ ok 
+----
+ t
+(1 row)
+
 -- This is to record the prevailing planner enable_foo settings during
 -- a regression test run.
 select name, setting from pg_settings where name like 'enable%';
diff --git a/src/test/regress/sql/sysviews.sql b/src/test/regress/sql/sysviews.sql
index 004f9a70e00..507e400ad4a 100644
--- a/src/test/regress/sql/sysviews.sql
+++ b/src/test/regress/sql/sysviews.sql
@@ -76,6 +76,9 @@ select count(*) = 1 as ok from pg_stat_wal;
 -- We expect no walreceiver running in this test
 select count(*) = 0 as ok from pg_stat_wal_receiver;
 
+-- We expect no recovery state in this test (running on primary)
+select count(*) = 0 as ok from pg_stat_recovery;
+
 -- This is to record the prevailing planner enable_foo settings during
 -- a regression test run.
 select name, setting from pg_settings where name like 'enable%';
-- 
2.51.0

From e5e1596cf711a7a52b529450401ad0d00285e05e Mon Sep 17 00:00:00 2001
From: alterego655 <[email protected]>
Date: Tue, 27 Jan 2026 13:59:29 +0800
Subject: [PATCH v3 2/3] Refactor: move XLogSource enum to xlogrecovery.h

Move the XLogSource enum definition from xlogrecovery.c to the public
header xlogrecovery.h to make it available for external use.

This is preparation for exposing WAL source information via the
pg_stat_recovery view. The xlogSourceNames array remains in the
implementation file as it's only used for debugging output.

No functional change.
---
 src/backend/access/transam/xlogrecovery.c | 12 ------------
 src/include/access/xlogrecovery.h         | 12 ++++++++++++
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 31806dcf008..680caee7a43 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -204,18 +204,6 @@ typedef struct XLogPageReadPrivate
 /* flag to tell XLogPageRead that we have started replaying */
 static bool InRedo = false;
 
-/*
- * Codes indicating where we got a WAL file from during recovery, or where
- * to attempt to get one.
- */
-typedef enum
-{
-	XLOG_FROM_ANY = 0,			/* request to read WAL from any source */
-	XLOG_FROM_ARCHIVE,			/* restored using restore_command */
-	XLOG_FROM_PG_WAL,			/* existing file in pg_wal */
-	XLOG_FROM_STREAM,			/* streamed from primary */
-} XLogSource;
-
 /* human-readable names for XLogSources, for debugging output */
 static const char *const xlogSourceNames[] = {"any", "archive", "pg_wal", "stream"};
 
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 3d1ee491f39..514595f0ee6 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -61,6 +61,18 @@ typedef enum RecoveryPauseState
 	RECOVERY_PAUSED,			/* recovery is paused */
 } RecoveryPauseState;
 
+/*
+ * Codes indicating where we got a WAL file from during recovery, or where
+ * to attempt to get one.
+ */
+typedef enum XLogSource
+{
+	XLOG_FROM_ANY = 0,			/* request to read WAL from any source */
+	XLOG_FROM_ARCHIVE,			/* restored using restore_command */
+	XLOG_FROM_PG_WAL,			/* existing file in pg_wal */
+	XLOG_FROM_STREAM,			/* streamed from primary */
+} XLogSource;
+
 /*
  * Shared-memory state for WAL recovery.
  */
-- 
2.51.0

From 0bb0b2f4af6abdb7bc2f256a44f80f1dd46ad6b3 Mon Sep 17 00:00:00 2001
From: alterego655 <[email protected]>
Date: Tue, 27 Jan 2026 14:02:52 +0800
Subject: [PATCH v3 3/3] Add wal_source column to pg_stat_recovery

Extend pg_stat_recovery with a wal_source column that shows where the
startup process most recently read WAL data from: 'archive', 'pg_wal',
or 'stream'.

This helps diagnose recovery behavior:
- Detecting streaming vs archive fallback transitions
- Monitoring initial standby catch-up progress
- Troubleshooting replication lag sources

The column reflects the current read source, not the original delivery
mechanism. Streamed WAL that is subsequently read from local files
shows 'pg_wal'. NULL if no WAL has been read yet.
---
 doc/src/sgml/monitoring.sgml              | 36 +++++++++++++++++++++++
 src/backend/access/transam/xlogfuncs.c    | 18 ++++++++++++
 src/backend/access/transam/xlogrecovery.c |  6 ++++
 src/backend/catalog/system_views.sql      |  3 +-
 src/include/access/xlogrecovery.h         |  8 +++++
 src/include/catalog/pg_proc.dat           |  6 ++--
 src/test/regress/expected/rules.out       |  5 ++--
 7 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 3bace7d1430..327515442a9 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2056,6 +2056,42 @@ description | Waiting for a newly initialized WAL file to reach durable storage
       </entry>
      </row>
 
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_source</structfield> <type>text</type>
+      </para>
+      <para>
+       Source from which the startup process most recently read WAL data.
+       Possible values are:
+      </para>
+       <itemizedlist>
+        <listitem>
+         <para>
+          <literal>archive</literal>: WAL restored using
+          <varname>restore_command</varname>.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>pg_wal</literal>: WAL read from local
+          <filename>pg_wal</filename> directory.
+         </para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>stream</literal>: WAL actively being streamed from the
+          upstream server.
+         </para>
+        </listitem>
+       </itemizedlist>
+      <para>
+       NULL if no WAL has been read yet.  Note that this reflects the
+       current read source, not the original delivery mechanism; streamed
+       WAL that is subsequently read from local files will show
+       <literal>pg_wal</literal>.
+      </para></entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 142289d8071..cdf7e9074e5 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -774,6 +774,7 @@ pg_stat_get_recovery(PG_FUNCTION_ARGS)
 	TimeLineID	replay_end_tli;
 	TimestampTz current_chunk_start_time;
 	RecoveryPauseState pause_state;
+	XLogSource	wal_source;
 
 	if (!RecoveryInProgress())
 		PG_RETURN_NULL();
@@ -791,6 +792,7 @@ pg_stat_get_recovery(PG_FUNCTION_ARGS)
 	replay_end_tli = XLogRecoveryCtl->replayEndTLI;
 	current_chunk_start_time = XLogRecoveryCtl->currentChunkStartTime;
 	pause_state = XLogRecoveryCtl->recoveryPauseState;
+	wal_source = XLogRecoveryCtl->lastReadSource;
 	SpinLockRelease(&XLogRecoveryCtl->info_lck);
 
 	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
@@ -844,5 +846,21 @@ pg_stat_get_recovery(PG_FUNCTION_ARGS)
 			break;
 	}
 
+	switch (wal_source)
+	{
+		case XLOG_FROM_ANY:
+			nulls[8] = true;
+			break;
+		case XLOG_FROM_ARCHIVE:
+			values[8] = CStringGetTextDatum("archive");
+			break;
+		case XLOG_FROM_PG_WAL:
+			values[8] = CStringGetTextDatum("pg_wal");
+			break;
+		case XLOG_FROM_STREAM:
+			values[8] = CStringGetTextDatum("stream");
+			break;
+	}
+
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 680caee7a43..c5ebdca8379 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -397,6 +397,7 @@ XLogRecoveryShmemInit(void)
 	memset(XLogRecoveryCtl, 0, sizeof(XLogRecoveryCtlData));
 
 	SpinLockInit(&XLogRecoveryCtl->info_lck);
+	XLogRecoveryCtl->lastReadSource = XLOG_FROM_ANY;
 	InitSharedLatch(&XLogRecoveryCtl->recoveryWakeupLatch);
 	ConditionVariableInit(&XLogRecoveryCtl->recoveryNotPausedCV);
 }
@@ -4249,6 +4250,11 @@ XLogFileRead(XLogSegNo segno, TimeLineID tli,
 		if (source != XLOG_FROM_STREAM)
 			XLogReceiptTime = GetCurrentTimestamp();
 
+		/* Update shared memory for external visibility */
+		SpinLockAcquire(&XLogRecoveryCtl->info_lck);
+		XLogRecoveryCtl->lastReadSource = source;
+		SpinLockRelease(&XLogRecoveryCtl->info_lck);
+
 		return fd;
 	}
 	if (errno != ENOENT || !notfoundOk) /* unexpected failure? */
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 3aa77558bd6..d28e665cbb1 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1014,7 +1014,8 @@ CREATE VIEW pg_stat_recovery AS
             s.replay_end_lsn,
             s.replay_end_tli,
             s.current_chunk_start_time,
-            s.pause_state
+            s.pause_state,
+            s.wal_source
     FROM pg_stat_get_recovery() s
     WHERE s.promote_triggered IS NOT NULL;
 
diff --git a/src/include/access/xlogrecovery.h b/src/include/access/xlogrecovery.h
index 514595f0ee6..f18922271a1 100644
--- a/src/include/access/xlogrecovery.h
+++ b/src/include/access/xlogrecovery.h
@@ -133,6 +133,14 @@ typedef struct XLogRecoveryCtlData
 	RecoveryPauseState recoveryPauseState;
 	ConditionVariable recoveryNotPausedCV;
 
+	/*
+	 * Source from which the startup process most recently read WAL data.
+	 * Updated when the startup process successfully reads WAL from a source.
+	 * Note: this reflects the read source, not the original receipt source;
+	 * streamed WAL read from local files will show XLOG_FROM_PG_WAL.
+	 */
+	XLogSource	lastReadSource;
+
 	slock_t		info_lck;		/* locks shared variables shown above */
 } XLogRecoveryCtlData;
 
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 514ec8dc5ad..f4c93139d77 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5701,9 +5701,9 @@
 { oid => '9949', descr => 'statistics: information about WAL recovery',
   proname => 'pg_stat_get_recovery', proisstrict => 'f', provolatile => 's',
   proparallel => 'r', prorettype => 'record', proargtypes => '',
-  proallargtypes => '{bool,pg_lsn,pg_lsn,int4,pg_lsn,int4,timestamptz,text}',
-  proargmodes => '{o,o,o,o,o,o,o,o}',
-  proargnames => '{promote_triggered,last_replayed_read_lsn,last_replayed_end_lsn,last_replayed_tli,replay_end_lsn,replay_end_tli,current_chunk_start_time,pause_state}',
+  proallargtypes => '{bool,pg_lsn,pg_lsn,int4,pg_lsn,int4,timestamptz,text,text}',
+  proargmodes => '{o,o,o,o,o,o,o,o,o}',
+  proargnames => '{promote_triggered,last_replayed_read_lsn,last_replayed_end_lsn,last_replayed_tli,replay_end_lsn,replay_end_tli,current_chunk_start_time,pause_state,wal_source}',
   prosrc => 'pg_stat_get_recovery' },
 { oid => '6169', descr => 'statistics: information about replication slot',
   proname => 'pg_stat_get_replication_slot', provolatile => 's',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 1150dc0ebf2..994d184f5a4 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2134,8 +2134,9 @@ pg_stat_recovery| SELECT promote_triggered,
     replay_end_lsn,
     replay_end_tli,
     current_chunk_start_time,
-    pause_state
-   FROM pg_stat_get_recovery() s(promote_triggered, last_replayed_read_lsn, last_replayed_end_lsn, last_replayed_tli, replay_end_lsn, replay_end_tli, current_chunk_start_time, pause_state)
+    pause_state,
+    wal_source
+   FROM pg_stat_get_recovery() s(promote_triggered, last_replayed_read_lsn, last_replayed_end_lsn, last_replayed_tli, replay_end_lsn, replay_end_tli, current_chunk_start_time, pause_state, wal_source)
   WHERE (s.promote_triggered IS NOT NULL);
 pg_stat_recovery_prefetch| SELECT stats_reset,
     prefetch,
-- 
2.51.0
