On 2021-02-10 00:51, David G. Johnston wrote:
On Thu, Feb 4, 2021 at 4:45 PM Masahiro Ikeda
<ikeda...@oss.nttdata.com> wrote:
I pgindented the patches.
... <function>XLogWrite</function>, which is invoked during an
<function>XLogFlush</function> request (see ...). This is also
incremented by the WAL receiver during replication.
("which normally called" should be "which is normally called" or
"which normally is called" if you want to keep true to the original)
You missed the adding the space before an opening parenthesis here and
elsewhere (probably copy-paste)
is ether -> is either
"This parameter is off by default as it will repeatedly query the
operating system..."
", because" -> "as"
Thanks, I fixed them.
wal_write_time and the sync items also need the note: "This is also
incremented by the WAL receiver during replication."
I skipped changing it since I separated the stats for the WAL receiver
in pg_stat_wal_receiver.
"The number of times it happened..." -> " (the tally of this event is
reported in wal_buffers_full in....) This is undesirable because ..."
Thanks, I fixed it.
I notice that the patch for WAL receiver doesn't require explicitly
computing the sync statistics but does require computing the write
statistics. This is because of the presence of issue_xlog_fsync but
absence of an equivalent pg_xlog_pwrite. Additionally, I observe that
the XLogWrite code path calls pgstat_report_wait_*() while the WAL
receiver path does not. It seems technically straight-forward to
refactor here to avoid the almost-duplicated logic in the two places,
though I suspect there may be a trade-off for not adding another
function call to the stack given the importance of WAL processing
(though that seems marginalized compared to the cost of actually
writing the WAL). Or, as Fujii noted, go the other way and don't have
any shared code between the two but instead implement the WAL receiver
one to use pg_stat_wal_receiver instead. In either case, this
half-and-half implementation seems undesirable.
OK, as Fujii-san mentioned, I separated the WAL receiver stats.
(v10-0002-Makes-the-wal-receiver-report-WAL-statistics.patch)
I added the infrastructure code to communicate the WAL receiver stats
messages between the WAL receiver and the stats collector, and
the stats for WAL receiver is counted in pg_stat_wal_receiver.
What do you think?
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
From 4ecbc467f88a2f923b3bc2eb6fe2c4f7725c02be Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikeda...@oss.nttdata.com>
Date: Fri, 12 Feb 2021 11:19:59 +0900
Subject: [PATCH 1/2] Add statistics related to write/sync wal records.
This patch adds following statistics to pg_stat_wal view
to track WAL I/O activity by backends and background processes
except WAL receiver.
- the total number of times writing/syncing WAL data.
- the total amount of time spent writing/syncing WAL data.
Since to track I/O timing may leads significant overhead,
GUC parameter "track_wal_io_timing" is introduced.
Only if this is on, the I/O timing is measured.
The statistics related to sync are zero when "wal_sync_method"
is "open_datasync" or "open_sync", because it doesn't call each
sync method.
(This requires a catversion bump, as well as an update to
PGSTAT_FILE_FORMAT_ID)
Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David
Johnston, Fujii Masao
Discussion:
https://postgr.es/m/0509ad67b585a5b86a83d445dfa75...@oss.nttdata.co
---
doc/src/sgml/config.sgml | 23 +++++++-
doc/src/sgml/monitoring.sgml | 50 ++++++++++++++++
doc/src/sgml/wal.sgml | 12 +++-
src/backend/access/transam/xlog.c | 57 +++++++++++++++++++
src/backend/catalog/system_views.sql | 4 ++
src/backend/postmaster/pgstat.c | 4 ++
src/backend/postmaster/walwriter.c | 16 ++++--
src/backend/utils/adt/pgstatfuncs.c | 24 +++++++-
src/backend/utils/misc/guc.c | 9 +++
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/access/xlog.h | 1 +
src/include/catalog/pg_proc.dat | 6 +-
src/include/pgstat.h | 10 ++++
src/test/regress/expected/rules.out | 6 +-
14 files changed, 208 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5ef1c7ad3c..0bb03d0717 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7404,7 +7404,7 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
<listitem>
<para>
Enables timing of database I/O calls. This parameter is off by
- default, because it will repeatedly query the operating system for
+ default, as it will repeatedly query the operating system for
the current time, which may cause significant overhead on some
platforms. You can use the <xref linkend="pgtesttiming"/> tool to
measure the overhead of timing on your system.
@@ -7418,6 +7418,27 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-track-wal-io-timing" xreflabel="track_wal_io_timing">
+ <term><varname>track_wal_io_timing</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>track_wal_io_timing</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables timing of WAL I/O calls. This parameter is off by default,
+ as it will repeatedly query the operating system for
+ the current time, which may cause significant overhead on some
+ platforms. You can use the <xref linkend="pgtesttiming"/> tool to
+ measure the overhead of timing on your system.
+ I/O timing information is
+ displayed in <link linkend="monitoring-pg-stat-wal-view">
+ <structname>pg_stat_wal</structname></link>. Only superusers can
+ change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-functions" xreflabel="track_functions">
<term><varname>track_functions</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index c602ee4427..8617449977 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3487,6 +3487,56 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</para></entry>
</row>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times WAL buffers were written out to disk via
+ <function>XLogWrite</function>, which is invoked during an
+ <function>XLogFlush</function> request (see <xref linkend="wal-configuration"/>).
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time spent writing WAL data to disk, excluding sync time unless
+ <xref linkend="guc-wal-sync-method"/> is either <literal>open_datasync</literal> or
+ <literal>open_sync</literal>. Units are in milliseconds with microsecond resolution.
+ This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times WAL files were synced to disk via
+ <function>issue_xlog_fsync</function>, which normally called by an
+ <function>XLogFlush</function> request (see <xref linkend="wal-configuration"/>),
+ while <xref linkend="guc-wal-sync-method"/> was set to one of the
+ "sync at commit" options (i.e., <literal>fdatasync</literal>,
+ <literal>fsync</literal>, or <literal>fsync_writethrough</literal>).
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time spent syncing WAL files to disk, in milliseconds with microsecond
+ resolution. This requires setting <xref linkend="guc-wal-sync-method"/> to one of
+ the "sync at commit" options (i.e., <literal>fdatasync</literal>, <literal>fsync</literal>,
+ or <literal>fsync_writethrough</literal>).
+ This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>stats_reset</structfield> <type>timestamp with time zone</type>
diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
index 66de1ee2f8..bca0180d73 100644
--- a/doc/src/sgml/wal.sgml
+++ b/doc/src/sgml/wal.sgml
@@ -663,7 +663,9 @@
the <acronym>WAL</acronym> buffers in shared memory. If there is no
space for the new record, <function>XLogInsertRecord</function> will have
to write (move to kernel cache) a few filled <acronym>WAL</acronym>
- buffers. This is undesirable because <function>XLogInsertRecord</function>
+ buffers (the tally of this event is reported in
+ <literal>wal_buffers_full</literal> in <xref linkend="pg-stat-wal-view"/>).
+ This is undesirable because <function>XLogInsertRecord</function>
is used on every database low level modification (for example, row
insertion) at a time when an exclusive lock is held on affected
data pages, so the operation needs to be as fast as possible. What
@@ -672,8 +674,12 @@
time. Normally, <acronym>WAL</acronym> buffers should be written
and flushed by an <function>XLogFlush</function> request, which is
made, for the most part, at transaction commit time to ensure that
- transaction records are flushed to permanent storage. On systems
- with high log output, <function>XLogFlush</function> requests might
+ transaction records are flushed to permanent storage.
+ <function>XLogFlush</function> calls <function>XLogWrite</function> to write
+ and <function>issue_xlog_fsync</function> to flush them, which are counted as
+ <literal>wal_write</literal> and <literal>wal_sync</literal> in
+ <xref linkend="pg-stat-wal-view"/>. On systems with high log output,
+ <function>XLogFlush</function> requests might
not occur often enough to prevent <function>XLogInsertRecord</function>
from having to do writes. On such systems
one should increase the number of <acronym>WAL</acronym> buffers by
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 9e03ca842d..904018ed46 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -110,6 +110,7 @@ int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
int wal_retrieve_retry_interval = 5000;
int max_slot_wal_keep_size_mb = -1;
+bool track_wal_io_timing = false;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
@@ -2536,6 +2537,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
Size nbytes;
Size nleft;
int written;
+ instr_time start;
/* OK to write the page(s) */
from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
@@ -2544,9 +2546,30 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
do
{
errno = 0;
+
+ /* Measure I/O timing to write WAL data */
+ if (track_wal_io_timing)
+ INSTR_TIME_SET_CURRENT(start);
+
pgstat_report_wait_start(WAIT_EVENT_WAL_WRITE);
written = pg_pwrite(openLogFile, from, nleft, startoffset);
pgstat_report_wait_end();
+
+ /*
+ * Increment the I/O timing and the number of times WAL data
+ * were written out to disk.
+ */
+ if (track_wal_io_timing)
+ {
+ instr_time duration;
+
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_write_time += INSTR_TIME_GET_MICROSEC(duration);
+ }
+
+ WalStats.m_wal_write++;
+
if (written <= 0)
{
char xlogfname[MAXFNAMELEN];
@@ -10544,6 +10567,21 @@ void
issue_xlog_fsync(int fd, XLogSegNo segno)
{
char *msg = NULL;
+ bool issue_fsync = false;
+ instr_time start;
+
+ /* Check whether the WAL file was synced to disk right now */
+ if (enableFsync &&
+ (sync_method == SYNC_METHOD_FSYNC ||
+ sync_method == SYNC_METHOD_FSYNC_WRITETHROUGH ||
+ sync_method == SYNC_METHOD_FDATASYNC))
+ {
+ /* Measure I/O timing to sync the WAL file */
+ if (track_wal_io_timing)
+ INSTR_TIME_SET_CURRENT(start);
+
+ issue_fsync = true;
+ }
pgstat_report_wait_start(WAIT_EVENT_WAL_SYNC);
switch (sync_method)
@@ -10588,6 +10626,25 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
}
pgstat_report_wait_end();
+
+ /*
+ * Increment the I/O timing and the number of times WAL files were synced.
+ *
+ * Check whether the WAL file was synced to disk right now because
+ * statistics must be incremented when syncing really occurred.
+ */
+ if (issue_fsync)
+ {
+ if (track_wal_io_timing)
+ {
+ instr_time duration;
+
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_sync_time += INSTR_TIME_GET_MICROSEC(duration);
+ }
+ WalStats.m_wal_sync++;
+ }
}
/*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index fa58afd9d7..b8ace4fc41 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1004,6 +1004,10 @@ CREATE VIEW pg_stat_wal AS
w.wal_fpi,
w.wal_bytes,
w.wal_buffers_full,
+ w.wal_write,
+ w.wal_write_time,
+ w.wal_sync,
+ w.wal_sync_time,
w.stats_reset
FROM pg_stat_get_wal() w;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index f75b52719d..987bbd058d 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -6892,6 +6892,10 @@ pgstat_recv_wal(PgStat_MsgWal *msg, int len)
walStats.wal_fpi += msg->m_wal_fpi;
walStats.wal_bytes += msg->m_wal_bytes;
walStats.wal_buffers_full += msg->m_wal_buffers_full;
+ walStats.wal_write += msg->m_wal_write;
+ walStats.wal_write_time += msg->m_wal_write_time;
+ walStats.wal_sync += msg->m_wal_sync;
+ walStats.wal_sync_time += msg->m_wal_sync_time;
}
/* ----------
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index 4f1a8e356b..08fa7032c0 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -223,6 +223,7 @@ WalWriterMain(void)
for (;;)
{
long cur_timeout;
+ int rc;
/*
* Advertise whether we might hibernate in this cycle. We do this
@@ -263,9 +264,16 @@ WalWriterMain(void)
else
cur_timeout = WalWriterDelay * HIBERNATE_FACTOR;
- (void) WaitLatch(MyLatch,
- WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
- cur_timeout,
- WAIT_EVENT_WAL_WRITER_MAIN);
+ rc = WaitLatch(MyLatch,
+ WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
+ cur_timeout,
+ WAIT_EVENT_WAL_WRITER_MAIN);
+
+ /*
+ * Send WAL statistics only if WalWriterDelay has elapsed to minimize
+ * the overhead in WAL-writing.
+ */
+ if (rc & WL_TIMEOUT)
+ pgstat_send_wal();
}
}
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 62bff52638..7296ef04ff 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1799,7 +1799,7 @@ pg_stat_get_buf_alloc(PG_FUNCTION_ARGS)
Datum
pg_stat_get_wal(PG_FUNCTION_ARGS)
{
-#define PG_STAT_GET_WAL_COLS 5
+#define PG_STAT_GET_WAL_COLS 9
TupleDesc tupdesc;
Datum values[PG_STAT_GET_WAL_COLS];
bool nulls[PG_STAT_GET_WAL_COLS];
@@ -1820,7 +1820,15 @@ pg_stat_get_wal(PG_FUNCTION_ARGS)
NUMERICOID, -1, 0);
TupleDescInitEntry(tupdesc, (AttrNumber) 4, "wal_buffers_full",
INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 5, "stats_reset",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "wal_write",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "wal_write_time",
+ FLOAT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "wal_sync",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "wal_sync_time",
+ FLOAT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "stats_reset",
TIMESTAMPTZOID, -1, 0);
BlessTupleDesc(tupdesc);
@@ -1840,7 +1848,17 @@ pg_stat_get_wal(PG_FUNCTION_ARGS)
Int32GetDatum(-1));
values[3] = Int64GetDatum(wal_stats->wal_buffers_full);
- values[4] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
+ values[4] = Int64GetDatum(wal_stats->wal_write);
+
+ /* convert counter from microsec to millisec for display */
+ values[5] = Float8GetDatum((double) wal_stats->wal_write_time / 1000.0);
+
+ values[6] = Int64GetDatum(wal_stats->wal_sync);
+
+ /* convert counter from microsec to millisec for display */
+ values[7] = Float8GetDatum((double) wal_stats->wal_sync_time / 1000.0);
+
+ values[8] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8735e36174..9a58922b8f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1485,6 +1485,15 @@ static struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+ {
+ {"track_wal_io_timing", PGC_SUSET, STATS_COLLECTOR,
+ gettext_noop("Collects timing statistics for WAL I/O activity."),
+ NULL
+ },
+ &track_wal_io_timing,
+ false,
+ NULL, NULL, NULL
+ },
{
{"update_process_title", PGC_SUSET, PROCESS_TITLE,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index bd57e917e1..a4c1d6650c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -585,6 +585,7 @@
#track_activities = on
#track_counts = on
#track_io_timing = off
+#track_wal_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)
#stats_temp_directory = 'pg_stat_tmp'
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 75ec1073bd..1e53d9d4ca 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -131,6 +131,7 @@ extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
extern bool wal_receiver_create_temp_slot;
+extern bool track_wal_io_timing;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ada55e7ad5..6962ffeef2 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5543,9 +5543,9 @@
{ oid => '1136', descr => 'statistics: information about WAL activity',
proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
proparallel => 'r', prorettype => 'record', proargtypes => '',
- proallargtypes => '{int8,int8,numeric,int8,timestamptz}',
- proargmodes => '{o,o,o,o,o}',
- proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,stats_reset}',
+ proallargtypes => '{int8,int8,numeric,int8,int8,float8,int8,float8,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o}',
+ proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_write,wal_write_time,wal_sync,wal_sync_time,stats_reset}',
prosrc => 'pg_stat_get_wal' },
{ oid => '2306', descr => 'statistics: information about SLRU caches',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 724068cf87..000bb14c0b 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -474,6 +474,12 @@ typedef struct PgStat_MsgWal
PgStat_Counter m_wal_fpi;
uint64 m_wal_bytes;
PgStat_Counter m_wal_buffers_full;
+ PgStat_Counter m_wal_write;
+ PgStat_Counter m_wal_write_time; /* time spend writing wal records in
+ * micro seconds */
+ PgStat_Counter m_wal_sync;
+ PgStat_Counter m_wal_sync_time; /* time spend syncing wal records in micro
+ * seconds */
} PgStat_MsgWal;
/* ----------
@@ -839,6 +845,10 @@ typedef struct PgStat_WalStats
PgStat_Counter wal_fpi;
uint64 wal_bytes;
PgStat_Counter wal_buffers_full;
+ PgStat_Counter wal_write;
+ PgStat_Counter wal_write_time;
+ PgStat_Counter wal_sync;
+ PgStat_Counter wal_sync_time;
TimestampTz stat_reset_timestamp;
} PgStat_WalStats;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index b632d9f2ea..2ad074f6a0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2158,8 +2158,12 @@ pg_stat_wal| SELECT w.wal_records,
w.wal_fpi,
w.wal_bytes,
w.wal_buffers_full,
+ w.wal_write,
+ w.wal_write_time,
+ w.wal_sync,
+ w.wal_sync_time,
w.stats_reset
- FROM pg_stat_get_wal() w(wal_records, wal_fpi, wal_bytes, wal_buffers_full, stats_reset);
+ FROM pg_stat_get_wal() w(wal_records, wal_fpi, wal_bytes, wal_buffers_full, wal_write, wal_write_time, wal_sync, wal_sync_time, stats_reset);
pg_stat_wal_receiver| SELECT s.pid,
s.status,
s.receive_start_lsn,
--
2.25.1
From f2d721d5bedcb916e12331fff316712af3176e0f Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikeda...@oss.nttdata.com>
Date: Mon, 15 Feb 2021 10:26:01 +0900
Subject: [PATCH 2/2] Makes the wal receiver report WAL statistics
This patch makes the WAL receiver report WAL statistics
and fundamentally changes how the stats collector's behaves
with regards to that function and how it interacts with
the WAL receiver.
(This requires a catversion bump, as well as an update to
PGSTAT_FILE_FORMAT_ID)
Author: Masahiro Ikeda
Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston,
Fujii Masao
Discussion:
https://postgr.es/m/0509ad67b585a5b86a83d445dfa75...@oss.nttdata.com
---
doc/src/sgml/monitoring.sgml | 86 +++++++++++++++++--
src/backend/access/transam/xlog.c | 12 ++-
src/backend/catalog/system_views.sql | 7 +-
src/backend/postmaster/pgstat.c | 114 +++++++++++++++++++++++++-
src/backend/replication/walreceiver.c | 46 +++++++++++
src/include/catalog/pg_proc.dat | 6 +-
src/include/pgstat.h | 40 ++++++++-
src/test/regress/expected/rules.out | 9 +-
8 files changed, 304 insertions(+), 16 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 8617449977..731db6d29b 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -2862,6 +2862,62 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
with security-sensitive fields obfuscated.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times WAL data were written out to disk by WAL receiver.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time spent writing WAL data to disk by WAL receiver,
+ excluding sync time unless <xref linkend="guc-wal-sync-method"/> is either
+ <literal>open_datasync</literal> or <literal>open_sync</literal>.
+ Units are in milliseconds with microsecond resolution.
+ This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Number of times WAL files were synced to disk by WAL receiver.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time spent syncing WAL files to disk by WAL receiver,
+ in milliseconds with microsecond resolution. This requires setting
+ <xref linkend="guc-wal-sync-method"/> to one of the "sync at commit"
+ options (i.e., <literal>fdatasync</literal>, <literal>fsync</literal>,
+ or <literal>fsync_writethrough</literal>).
+ This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+ </para>
+ <para>
+ Time at which these statistics counters (i.e. <literal>wal_write</literal>,
+ <literal>wal_write_time</literal>, <literal>wal_sync</literal>, and
+ <literal>wal_sync_time</literal> ) were last reset.
+ </para></entry>
+ </row>
</tbody>
</tgroup>
</table>
@@ -3492,9 +3548,13 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<structfield>wal_write</structfield> <type>bigint</type>
</para>
<para>
- Number of times WAL buffers were written out to disk via
+ Number of times WAL buffers were written out to disk by backends and
+ background processes except WAL receiver via
<function>XLogWrite</function>, which is invoked during an
<function>XLogFlush</function> request (see <xref linkend="wal-configuration"/>).
+ The same statistics for WAL receiver is counted in
+ <link linkend="monitoring-pg-stat-wal-receiver-view">
+ <structname>pg_stat_wal_receiver</structname></link>.
</para></entry>
</row>
@@ -3503,10 +3563,14 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<structfield>wal_write_time</structfield> <type>double precision</type>
</para>
<para>
- Total amount of time spent writing WAL data to disk, excluding sync time unless
+ Total amount of time spent writing WAL data to disk by backends and
+ background processes except WAL receiver, excluding sync time unless
<xref linkend="guc-wal-sync-method"/> is either <literal>open_datasync</literal> or
<literal>open_sync</literal>. Units are in milliseconds with microsecond resolution.
This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ The same statistics for WAL receiver is counted in
+ <link linkend="monitoring-pg-stat-wal-receiver-view">
+ <structname>pg_stat_wal_receiver</structname></link>.
</para></entry>
</row>
@@ -3515,12 +3579,16 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<structfield>wal_sync</structfield> <type>bigint</type>
</para>
<para>
- Number of times WAL files were synced to disk via
+ Number of times WAL files were synced to disk by backends and
+ background processes except WAL receiver, via
<function>issue_xlog_fsync</function>, which normally called by an
<function>XLogFlush</function> request (see <xref linkend="wal-configuration"/>),
while <xref linkend="guc-wal-sync-method"/> was set to one of the
"sync at commit" options (i.e., <literal>fdatasync</literal>,
<literal>fsync</literal>, or <literal>fsync_writethrough</literal>).
+ The same statistics for WAL receiver is counted in
+ <link linkend="monitoring-pg-stat-wal-receiver-view">
+ <structname>pg_stat_wal_receiver</structname></link>.
</para></entry>
</row>
@@ -3529,11 +3597,15 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<structfield>wal_sync_time</structfield> <type>double precision</type>
</para>
<para>
- Total amount of time spent syncing WAL files to disk, in milliseconds with microsecond
+ Total amount of time spent syncing WAL files to disk by backends and
+ background processes except WAL receiver, in milliseconds with microsecond
resolution. This requires setting <xref linkend="guc-wal-sync-method"/> to one of
the "sync at commit" options (i.e., <literal>fdatasync</literal>, <literal>fsync</literal>,
or <literal>fsync_writethrough</literal>).
This is zero when <xref linkend="guc-track-wal-io-timing"/> is disabled.
+ The same statistics for WAL receiver is counted in
+ <link linkend="monitoring-pg-stat-wal-receiver-view">
+ <structname>pg_stat_wal_receiver</structname></link>.
</para></entry>
</row>
@@ -5017,7 +5089,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
all the counters shown in
the <structname>pg_stat_bgwriter</structname>
view, <literal>archiver</literal> to reset all the counters shown in
- the <structname>pg_stat_archiver</structname> view or <literal>wal</literal>
+ the <structname>pg_stat_archiver</structname> view, <literal>walreceiver</literal>
+ to reset all the counters (i.e. <literal>wal_write</literal>,
+ <literal>wal_write_time</literal>, <literal>wal_sync</literal>, and
+ <literal>wal_sync_time</literal> ) shown in the
+ <structname>pg_stat_wal_receiver</structname> view, or <literal>wal</literal>
to reset all the counters shown in the <structname>pg_stat_wal</structname> view.
</para>
<para>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 904018ed46..ba467bf41c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10641,9 +10641,17 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
INSTR_TIME_SET_CURRENT(duration);
INSTR_TIME_SUBTRACT(duration, start);
- WalStats.m_wal_sync_time += INSTR_TIME_GET_MICROSEC(duration);
+
+ if (AmWalReceiverProcess())
+ WalReceiverStats.m_wal_sync_time += INSTR_TIME_GET_MICROSEC(duration);
+ else
+ WalStats.m_wal_sync_time += INSTR_TIME_GET_MICROSEC(duration);
}
- WalStats.m_wal_sync++;
+
+ if (AmWalReceiverProcess())
+ WalReceiverStats.m_wal_sync++;
+ else
+ WalStats.m_wal_sync++;
}
}
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index b8ace4fc41..cf4e1f6355 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -837,7 +837,12 @@ CREATE VIEW pg_stat_wal_receiver AS
s.slot_name,
s.sender_host,
s.sender_port,
- s.conninfo
+ s.conninfo,
+ s.wal_write,
+ s.wal_write_time,
+ s.wal_sync,
+ s.wal_sync_time,
+ s.stats_reset
FROM pg_stat_get_wal_receiver() s
WHERE s.pid IS NOT NULL;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 987bbd058d..1a98d1ac53 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -137,11 +137,12 @@ char *pgstat_stat_filename = NULL;
char *pgstat_stat_tmpname = NULL;
/*
- * BgWriter and WAL global statistics counters.
+ * BgWriter, WAL receiver and WAL global statistics counters.
* Stored directly in a stats message structure so they can be sent
* without needing to copy things around. We assume these init to zeroes.
*/
PgStat_MsgBgWriter BgWriterStats;
+PgStat_MsgWalReceiver WalReceiverStats;
PgStat_MsgWal WalStats;
/*
@@ -295,6 +296,7 @@ static int localNumBackends = 0;
*/
static PgStat_ArchiverStats archiverStats;
static PgStat_GlobalStats globalStats;
+static PgStat_WalReceiverStats walReceiverStats;
static PgStat_WalStats walStats;
static PgStat_SLRUStats slruStats[SLRU_NUM_ELEMENTS];
static PgStat_ReplSlotStats *replSlotStats;
@@ -375,6 +377,7 @@ static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
static void pgstat_recv_archiver(PgStat_MsgArchiver *msg, int len);
static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len);
+static void pgstat_recv_walreceiver(PgStat_MsgWalReceiver * msg, int len);
static void pgstat_recv_wal(PgStat_MsgWal *msg, int len);
static void pgstat_recv_slru(PgStat_MsgSLRU *msg, int len);
static void pgstat_recv_funcstat(PgStat_MsgFuncstat *msg, int len);
@@ -1450,13 +1453,15 @@ pgstat_reset_shared_counters(const char *target)
msg.m_resettarget = RESET_ARCHIVER;
else if (strcmp(target, "bgwriter") == 0)
msg.m_resettarget = RESET_BGWRITER;
+ else if (strcmp(target, "walreceiver") == 0)
+ msg.m_resettarget = RESET_WALRECEIVER;
else if (strcmp(target, "wal") == 0)
msg.m_resettarget = RESET_WAL;
else
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("unrecognized reset target: \"%s\"", target),
- errhint("Target must be \"archiver\", \"bgwriter\" or \"wal\".")));
+ errhint("Target must be \"archiver\", \"bgwriter\", \"walreceiver\" or \"wal\".")));
pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RESETSHAREDCOUNTER);
pgstat_send(&msg, sizeof(msg));
@@ -2852,6 +2857,22 @@ pgstat_fetch_global(void)
return &globalStats;
}
+/*
+ * ---------
+ * pgstat_fetch_stat_walreceiver() -
+ *
+ * Support function for the SQL-callable pgstat* functions. Returns
+ * a pointer to the WAL receiver statistics struct.
+ * ---------
+ */
+PgStat_WalReceiverStats *
+pgstat_fetch_stat_walreceiver(void)
+{
+ backend_read_statsfile();
+
+ return &walReceiverStats;
+}
+
/*
* ---------
* pgstat_fetch_stat_wal() -
@@ -4666,6 +4687,39 @@ pgstat_send_bgwriter(void)
MemSet(&BgWriterStats, 0, sizeof(BgWriterStats));
}
+/* ----------
+ * pgstat_send_walreceiver() -
+ *
+ * Send wal receiver statistics to the collector
+ * ----------
+ */
+void
+pgstat_send_walreceiver(void)
+{
+ /* We assume this initializes to zeroes */
+ static const PgStat_MsgWalReceiver all_zeroes;
+
+ /*
+ * This function can be called even if nothing at all has happened. In
+ * this case, avoid sending a completely empty message to the stats
+ * collector.
+ */
+ if (memcmp(&WalReceiverStats, &all_zeroes, sizeof(PgStat_MsgWalReceiver)) == 0)
+ return;
+
+ /*
+ * Prepare and send the message
+ */
+ pgstat_setheader(&WalReceiverStats.m_hdr, PGSTAT_MTYPE_WALRECEIVER);
+ pgstat_send(&WalReceiverStats, sizeof(WalReceiverStats));
+
+ /*
+ * Clear out the statistics buffer, so it can be re-used.
+ */
+ MemSet(&WalReceiverStats, 0, sizeof(WalReceiverStats));
+}
+
+
/* ----------
* pgstat_send_wal() -
*
@@ -4961,6 +5015,10 @@ PgstatCollectorMain(int argc, char *argv[])
pgstat_recv_bgwriter(&msg.msg_bgwriter, len);
break;
+ case PGSTAT_MTYPE_WALRECEIVER:
+ pgstat_recv_walreceiver(&msg.msg_walreceiver, len);
+ break;
+
case PGSTAT_MTYPE_WAL:
pgstat_recv_wal(&msg.msg_wal, len);
break;
@@ -5249,6 +5307,12 @@ pgstat_write_statsfiles(bool permanent, bool allDbs)
rc = fwrite(&archiverStats, sizeof(archiverStats), 1, fpout);
(void) rc; /* we'll check for error with ferror */
+ /*
+ * Write WAL receiver stats struct
+ */
+ rc = fwrite(&walReceiverStats, sizeof(walReceiverStats), 1, fpout);
+ (void) rc; /* we'll check for error with ferror */
+
/*
* Write WAL stats struct
*/
@@ -5532,6 +5596,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
*/
memset(&globalStats, 0, sizeof(globalStats));
memset(&archiverStats, 0, sizeof(archiverStats));
+ memset(&walReceiverStats, 0, sizeof(walReceiverStats));
memset(&walStats, 0, sizeof(walStats));
memset(&slruStats, 0, sizeof(slruStats));
@@ -5541,6 +5606,7 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
*/
globalStats.stat_reset_timestamp = GetCurrentTimestamp();
archiverStats.stat_reset_timestamp = globalStats.stat_reset_timestamp;
+ walReceiverStats.stat_reset_timestamp = globalStats.stat_reset_timestamp;
walStats.stat_reset_timestamp = globalStats.stat_reset_timestamp;
/*
@@ -5617,6 +5683,17 @@ pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
goto done;
}
+ /*
+ * Read WAL receiver stats struct
+ */
+ if (fread(&walReceiverStats, 1, sizeof(walReceiverStats), fpin) != sizeof(walReceiverStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"", statfile)));
+ memset(&walReceiverStats, 0, sizeof(walReceiverStats));
+ goto done;
+ }
+
/*
* Read WAL stats struct
*/
@@ -5954,6 +6031,7 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
PgStat_StatDBEntry dbentry;
PgStat_GlobalStats myGlobalStats;
PgStat_ArchiverStats myArchiverStats;
+ PgStat_WalReceiverStats myWalReceiverStats;
PgStat_WalStats myWalStats;
PgStat_SLRUStats mySLRUStats[SLRU_NUM_ELEMENTS];
PgStat_ReplSlotStats myReplSlotStats;
@@ -6011,6 +6089,17 @@ pgstat_read_db_statsfile_timestamp(Oid databaseid, bool permanent,
return false;
}
+ /*
+ * Read WAL receiver stats struct
+ */
+ if (fread(&myWalReceiverStats, 1, sizeof(myWalReceiverStats), fpin) != sizeof(myWalReceiverStats))
+ {
+ ereport(pgStatRunningInCollector ? LOG : WARNING,
+ (errmsg("corrupted statistics file \"%s\"", statfile)));
+ FreeFile(fpin);
+ return false;
+ }
+
/*
* Read WAL stats struct
*/
@@ -6619,6 +6708,12 @@ pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
memset(&archiverStats, 0, sizeof(archiverStats));
archiverStats.stat_reset_timestamp = GetCurrentTimestamp();
}
+ else if (msg->m_resettarget == RESET_WALRECEIVER)
+ {
+ /* Reset the WAL receiver statistics for the cluster. */
+ memset(&walReceiverStats, 0, sizeof(walReceiverStats));
+ walReceiverStats.stat_reset_timestamp = GetCurrentTimestamp();
+ }
else if (msg->m_resettarget == RESET_WAL)
{
/* Reset the WAL statistics for the cluster. */
@@ -6879,6 +6974,21 @@ pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len)
globalStats.buf_alloc += msg->m_buf_alloc;
}
+/* ----------
+ * pgstat_recv_walreceiver() -
+ *
+ * Process a WALRECEIVER message.
+ * ----------
+ */
+static void
+pgstat_recv_walreceiver(PgStat_MsgWalReceiver * msg, int len)
+{
+ walReceiverStats.wal_write += msg->m_wal_write;
+ walReceiverStats.wal_write_time += msg->m_wal_write_time;
+ walReceiverStats.wal_sync += msg->m_wal_sync;
+ walReceiverStats.wal_sync_time += msg->m_wal_sync_time;
+}
+
/* ----------
* pgstat_recv_wal() -
*
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index eaf5ec9a72..f04e1d99e7 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -773,6 +773,9 @@ WalRcvDie(int code, Datum arg)
/* Ensure that all WAL records received are flushed to disk */
XLogWalRcvFlush(true);
+ /* Send WAL receiver statistics to the stats collector before terminating */
+ pgstat_send_walreceiver();
+
/* Mark ourselves inactive in shared memory */
SpinLockAcquire(&walrcv->mutex);
Assert(walrcv->walRcvState == WALRCV_STREAMING ||
@@ -874,6 +877,7 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
while (nbytes > 0)
{
int segbytes;
+ instr_time start;
if (recvFile < 0 || !XLByteInSeg(recptr, recvSegNo, wal_segment_size))
{
@@ -910,6 +914,13 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
XLogArchiveForceDone(xlogfname);
else
XLogArchiveNotify(xlogfname);
+
+ /*
+ * Send WAL receiver statistics to the stats collector when
+ * finishing the current WAL segment file to avoid overloading
+ * it.
+ */
+ pgstat_send_walreceiver();
}
recvFile = -1;
@@ -931,7 +942,27 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
/* OK to write the logs */
errno = 0;
+ /* Measure I/O timing to write WAL data */
+ if (track_wal_io_timing)
+ INSTR_TIME_SET_CURRENT(start);
+
byteswritten = pg_pwrite(recvFile, buf, segbytes, (off_t) startoff);
+
+ /*
+ * Increment the I/O timing and the number of times WAL data were
+ * written out to disk.
+ */
+ if (track_wal_io_timing)
+ {
+ instr_time duration;
+
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalReceiverStats.m_wal_write_time += INSTR_TIME_GET_MICROSEC(duration);
+ }
+
+ WalReceiverStats.m_wal_write++;
+
if (byteswritten <= 0)
{
char xlogfname[MAXFNAMELEN];
@@ -1317,6 +1348,7 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
int sender_port = 0;
char slotname[NAMEDATALEN];
char conninfo[MAXCONNINFO];
+ PgStat_WalReceiverStats *walreceiver_stats;
/* Take a lock to ensure value consistency */
SpinLockAcquire(&WalRcv->mutex);
@@ -1338,6 +1370,9 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
strlcpy(conninfo, (char *) WalRcv->conninfo, sizeof(conninfo));
SpinLockRelease(&WalRcv->mutex);
+ /* Get statistics about WAL receiver */
+ walreceiver_stats = pgstat_fetch_stat_walreceiver();
+
/*
* No WAL receiver (or not ready yet), just return a tuple with NULL
* values
@@ -1414,6 +1449,17 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS)
nulls[14] = true;
else
values[14] = CStringGetTextDatum(conninfo);
+
+ /* returns WAL I/O activity */
+ values[15] = Int64GetDatum(walreceiver_stats->wal_write);
+
+ /* convert counter from microsec to millisec for display */
+ values[16] = Float8GetDatum((double) walreceiver_stats->wal_write_time / 1000.0);
+ values[17] = Int64GetDatum(walreceiver_stats->wal_sync);
+
+ /* convert counter from microsec to millisec for display */
+ values[18] = Float8GetDatum((double) walreceiver_stats->wal_sync_time / 1000.0);
+ values[19] = TimestampTzGetDatum(walreceiver_stats->stat_reset_timestamp);
}
/* Returns the record as Datum */
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 6962ffeef2..f04ee4a434 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5272,9 +5272,9 @@
{ oid => '3317', descr => 'statistics: information about WAL receiver',
proname => 'pg_stat_get_wal_receiver', proisstrict => 'f', provolatile => 's',
proparallel => 'r', prorettype => 'record', proargtypes => '',
- proallargtypes => '{int4,text,pg_lsn,int4,pg_lsn,pg_lsn,int4,timestamptz,timestamptz,pg_lsn,timestamptz,text,text,int4,text}',
- proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
- proargnames => '{pid,status,receive_start_lsn,receive_start_tli,written_lsn,flushed_lsn,received_tli,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time,slot_name,sender_host,sender_port,conninfo}',
+ proallargtypes => '{int4,text,pg_lsn,int4,pg_lsn,pg_lsn,int4,timestamptz,timestamptz,pg_lsn,timestamptz,text,text,int4,text,int8,float8,int8,float8,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{pid,status,receive_start_lsn,receive_start_tli,written_lsn,flushed_lsn,received_tli,last_msg_send_time,last_msg_receipt_time,latest_end_lsn,latest_end_time,slot_name,sender_host,sender_port,conninfo,wal_write,wal_write_time,wal_sync,wal_sync_time,stats_reset}',
prosrc => 'pg_stat_get_wal_receiver' },
{ oid => '8595', descr => 'statistics: information about replication slots',
proname => 'pg_stat_get_replication_slots', prorows => '10',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 000bb14c0b..fb6bf3282f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -72,6 +72,7 @@ typedef enum StatMsgType
PGSTAT_MTYPE_ANALYZE,
PGSTAT_MTYPE_ARCHIVER,
PGSTAT_MTYPE_BGWRITER,
+ PGSTAT_MTYPE_WALRECEIVER,
PGSTAT_MTYPE_WAL,
PGSTAT_MTYPE_SLRU,
PGSTAT_MTYPE_FUNCSTAT,
@@ -137,6 +138,7 @@ typedef enum PgStat_Shared_Reset_Target
{
RESET_ARCHIVER,
RESET_BGWRITER,
+ RESET_WALRECEIVER,
RESET_WAL
} PgStat_Shared_Reset_Target;
@@ -463,6 +465,21 @@ typedef struct PgStat_MsgBgWriter
PgStat_Counter m_checkpoint_sync_time;
} PgStat_MsgBgWriter;
+/* ----------
+ * PgStat_MsgWalReceiver Sent by wal receiver to update statistics.
+ * ----------
+ */
+typedef struct PgStat_MsgWalReceiver
+{
+ PgStat_MsgHdr m_hdr;
+ PgStat_Counter m_wal_write;
+ PgStat_Counter m_wal_write_time; /* time spend writing wal records in
+ * micro seconds */
+ PgStat_Counter m_wal_sync;
+ PgStat_Counter m_wal_sync_time; /* time spend syncing wal records in micro
+ * seconds */
+} PgStat_MsgWalReceiver;
+
/* ----------
* PgStat_MsgWal Sent by backends and background processes to update WAL statistics.
* ----------
@@ -677,6 +694,7 @@ typedef union PgStat_Msg
PgStat_MsgAnalyze msg_analyze;
PgStat_MsgArchiver msg_archiver;
PgStat_MsgBgWriter msg_bgwriter;
+ PgStat_MsgWalReceiver msg_walreceiver;
PgStat_MsgWal msg_wal;
PgStat_MsgSLRU msg_slru;
PgStat_MsgFuncstat msg_funcstat;
@@ -698,7 +716,7 @@ typedef union PgStat_Msg
* ------------------------------------------------------------
*/
-#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA0
+#define PGSTAT_FILE_FORMAT_ID 0x01A5BCA1
/* ----------
* PgStat_StatDBEntry The collector's data per database
@@ -836,6 +854,18 @@ typedef struct PgStat_GlobalStats
TimestampTz stat_reset_timestamp;
} PgStat_GlobalStats;
+/*
+ * WAL receiver statistics kept in the stats collector
+ */
+typedef struct PgStat_WalReceiverStats
+{
+ PgStat_Counter wal_write;
+ PgStat_Counter wal_write_time;
+ PgStat_Counter wal_sync;
+ PgStat_Counter wal_sync_time;
+ TimestampTz stat_reset_timestamp;
+} PgStat_WalReceiverStats;
+
/*
* WAL statistics kept in the stats collector
*/
@@ -1387,8 +1417,14 @@ extern char *pgstat_stat_filename;
*/
extern PgStat_MsgBgWriter BgWriterStats;
+/*
+ * WAL receiver statistics counter is updated by wal receiver
+ */
+extern PgStat_MsgWalReceiver WalReceiverStats;
+
/*
* WAL statistics counter is updated by backends and background processes
+ * excepting wal receiver because it's counted via WalReceiverStats.
*/
extern PgStat_MsgWal WalStats;
@@ -1600,6 +1636,7 @@ extern void pgstat_twophase_postabort(TransactionId xid, uint16 info,
extern void pgstat_send_archiver(const char *xlog, bool failed);
extern void pgstat_send_bgwriter(void);
+extern void pgstat_send_walreceiver(void);
extern void pgstat_send_wal(void);
/* ----------
@@ -1615,6 +1652,7 @@ extern PgStat_StatFuncEntry *pgstat_fetch_stat_funcentry(Oid funcid);
extern int pgstat_fetch_stat_numbackends(void);
extern PgStat_ArchiverStats *pgstat_fetch_stat_archiver(void);
extern PgStat_GlobalStats *pgstat_fetch_global(void);
+extern PgStat_WalReceiverStats * pgstat_fetch_stat_walreceiver(void);
extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
extern PgStat_SLRUStats *pgstat_fetch_slru(void);
extern PgStat_ReplSlotStats *pgstat_fetch_replslot(int *nslots_p);
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 2ad074f6a0..1bf8e06347 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2178,8 +2178,13 @@ pg_stat_wal_receiver| SELECT s.pid,
s.slot_name,
s.sender_host,
s.sender_port,
- s.conninfo
- FROM pg_stat_get_wal_receiver() s(pid, status, receive_start_lsn, receive_start_tli, written_lsn, flushed_lsn, received_tli, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time, slot_name, sender_host, sender_port, conninfo)
+ s.conninfo,
+ s.wal_write,
+ s.wal_write_time,
+ s.wal_sync,
+ s.wal_sync_time,
+ s.stats_reset
+ FROM pg_stat_get_wal_receiver() s(pid, status, receive_start_lsn, receive_start_tli, written_lsn, flushed_lsn, received_tli, last_msg_send_time, last_msg_receipt_time, latest_end_lsn, latest_end_time, slot_name, sender_host, sender_port, conninfo, wal_write, wal_write_time, wal_sync, wal_sync_time, stats_reset)
WHERE (s.pid IS NOT NULL);
pg_stat_xact_all_tables| SELECT c.oid AS relid,
n.nspname AS schemaname,
--
2.25.1