On 2020-11-12 14:58, Fujii Masao wrote:
On 2020/11/06 10:25, Masahiro Ikeda wrote:
On 2020-10-30 11:50, Fujii Masao wrote:
On 2020/10/29 17:03, Masahiro Ikeda wrote:
Hi,
Thanks for your comments and advice. I updated the patch.
On 2020-10-21 18:03, Kyotaro Horiguchi wrote:
At Tue, 20 Oct 2020 16:11:29 +0900, Masahiro Ikeda
<ikeda...@oss.nttdata.com> wrote in
On 2020-10-20 12:46, Amit Kapila wrote:
> I see that we also need to add extra code to capture these stats (some
> of which is in performance-critical path especially in
> XLogInsertRecord) which again makes me a bit uncomfortable. It might
> be that it is all fine as it is very important to collect these stats
> at cluster-level in spite that the same information can be gathered at
> statement-level to help customers but I don't see a very strong case
> for that in your proposal.
We should avoid that duplication as possible even if the both
number
are important.
Also about performance, I thought there are few impacts because it
increments stats in memory. If I can implement to reuse
pgWalUsage's
value which already collects these stats, there is no impact in
XLogInsertRecord.
For example, how about pg_stat_wal() calculates the accumulated
value of wal_records, wal_fpi, and wal_bytes to use pgWalUsage's
value?
I don't think that works, but it would work that pgstat_send_wal()
takes the difference of that values between two successive calls.
WalUsage prevWalUsage;
...
pgstat_send_wal()
{
..
/* fill in some values using pgWalUsage */
WalStats.m_wal_bytes = pgWalUsage.wal_bytes -
prevWalUsage.wal_bytes;
WalStats.m_wal_records = pgWalUsage.wal_records -
prevWalUsage.wal_records;
WalStats.m_wal_wal_fpi = pgWalUsage.wal_fpi -
prevWalUsage.wal_fpi;
...
pgstat_send(&WalStats, sizeof(WalStats));
/* remember the current numbers */
prevWalUsage = pgWalUsage;
Thanks for Horiguchi-san's advice, I changed to reuse pgWalUsage
which is already defined and eliminates the extra overhead.
+ /* fill in some values using pgWalUsage */
+ WalStats.m_wal_bytes = pgWalUsage.wal_bytes -
prevWalUsage.wal_bytes;
+ WalStats.m_wal_records = pgWalUsage.wal_records -
prevWalUsage.wal_records;
+ WalStats.m_wal_fpi = pgWalUsage.wal_fpi - prevWalUsage.wal_fpi;
It's better to use WalUsageAccumDiff() here?
Yes, thanks. I fixed it.
prevWalUsage needs to be initialized with pgWalUsage?
+ if (AmWalWriterProcess()){
+ WalStats.m_wal_write_walwriter++;
+ }
+ else
+ {
+ WalStats.m_wal_write_backend++;
+ }
I think that it's better not to separate m_wal_write_xxx into two for
walwriter and other processes. Instead, we can use one
m_wal_write_xxx
counter and make pgstat_send_wal() send also the process type to
the stats collector. Then the stats collector can accumulate the
counters
per process type if necessary. If we adopt this approach, we can
easily
extend pg_stat_wal so that any fields can be reported per process
type.
I'll remove the above source code because these counters are not
useful.
On 2020-10-30 12:00, Fujii Masao wrote:
On 2020/10/20 11:31, Masahiro Ikeda wrote:
Hi,
I think we need to add some statistics to pg_stat_wal view.
Although there are some parameter related WAL,
there are few statistics for tuning them.
I think it's better to provide the following statistics.
Please let me know your comments.
```
postgres=# SELECT * from pg_stat_wal;
-[ RECORD 1 ]-------+------------------------------
wal_records | 2000224
wal_fpi | 47
wal_bytes | 248216337
wal_buffers_full | 20954
wal_init_file | 8
wal_write_backend | 20960
wal_write_walwriter | 46
wal_write_time | 51
wal_sync_backend | 7
wal_sync_walwriter | 8
wal_sync_time | 0
stats_reset | 2020-10-20 11:04:51.307771+09
```
1. Basic statistics of WAL activity
- wal_records: Total number of WAL records generated
- wal_fpi: Total number of WAL full page images generated
- wal_bytes: Total amount of WAL bytes generated
To understand DB's performance, first, we will check the performance
trends for the entire database instance.
For example, if the number of wal_fpi becomes higher, users may tune
"wal_compression", "checkpoint_timeout" and so on.
Although users can check the above statistics via EXPLAIN,
auto_explain,
autovacuum and pg_stat_statements now,
if users want to see the performance trends for the entire
database,
they must recalculate the statistics.
I think it is useful to add the sum of the basic statistics.
2. WAL segment file creation
- wal_init_file: Total number of WAL segment files created.
To create a new WAL file may have an impact on the performance of
a write-heavy workload generating lots of WAL. If this number is
reported high,
to reduce the number of this initialization, we can tune WAL-related
parameters
so that more "recycled" WAL files can be held.
3. Number of when WAL is flushed
- wal_write_backend : Total number of WAL data written to the disk
by backends
- wal_write_walwriter : Total number of WAL data written to the disk
by walwriter
- wal_sync_backend : Total number of WAL data synced to the disk by
backends
- wal_sync_walwriter : Total number of WAL data synced to the disk
by walwrite
I think it's useful for tuning "synchronous_commit" and
"commit_delay" for query executions.
If the number of WAL is flushed is high, users can know
"synchronous_commit" is useful for the workload.
I just wonder how useful these counters are. Even without these
counters,
we already know synchronous_commit=off is likely to cause the better
performance (but has the risk of data loss). So ISTM that these
counters are
not so useful when tuning synchronous_commit.
Thanks, my understanding was wrong.
I agreed that your comments.
I merged the statistics of *_backend and *_walwriter.
I think the sum of them is useful to calculate the average per
write/sync time.
For example, per write time is equals wal_write_time / wal_write.
Understood.
Thanks for updating the patch!
Thanks for your comments.
patching file src/include/catalog/pg_proc.dat
Hunk #1 FAILED at 5491.
1 out of 1 hunk FAILED -- saving rejects to file
src/include/catalog/pg_proc.dat.rej
I got this failure when applying the patch. Could you update the patch?
Thanks, I updated the patch.
- Number of times WAL data was written to the disk because WAL
buffers got full
+ Total number of times WAL data written to the disk because WAL
buffers got full
Isn't "was" necessary between "data" and "written"?
Yes, I fixed it.
+ <entry role="catalog_table_entry"><para
role="column_definition">
+ <structfield>wal_bytes</structfield> <type>bigint</type>
Shouldn't the type of wal_bytes be numeric because the total number of
WAL bytes can exceed the range of bigint? I think that the type of
pg_stat_statements.wal_bytes is also numeric for the same reason.
Thanks, I fixed it.
Since I cast the type of wal_bytes from PgStat_Counter to uint64,
I changed the type of PgStat_MsgWal and PgStat_WalStats too.
+ <entry role="catalog_table_entry"><para
role="column_definition">
+ <structfield>wal_write_time</structfield> <type>bigint</type>
Shouldn't the type of wal_xxx_time be double precision,
like pg_stat_database.blk_write_time?
Thanks, I changed it.
Even when fsync is set to off or wal_sync_method is set to open_sync,
wal_sync is incremented. Isn't this behavior confusing?
+ Total amount of time that has been spent in the portion of
+ WAL data was written to disk by backend and walwriter, in
milliseconds
+ (if <xref linkend="guc-track-io-timing"/> is enabled, otherwise
zero)
With the patch, track_io_timing controls both IO for data files and
WAL files. But we may want to track only either of them. So it's better
to extend track_io_timing so that we can specify the tracking target
in the parameter? For example, we can make track_io_timing accept
data, wal and all. Or we should introduce new GUC for WAL, e.g.,
track_wal_io_timing? Thought?
OK, I introduced the new GUC "track_wal_io_timing".
I'm afraid that "by backend and walwriter" part can make us thinkg
incorrectly that WAL writes by other processes like autovacuum
are not tracked.
Sorry, I removed "by backend and walwriter".
Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index f043433e..171d5d59 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7365,6 +7365,27 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
+ <varlistentry id="guc-track-wal-io-timing" xreflabel="track_wal_io_timing">
+ <term><varname>track_wal_io_timing</varname> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>track_wal_io_timing</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Enables timing of WAL I/O calls. This parameter is off by default,
+ because it will repeatedly query the operating system for
+ the current time, which may cause significant overhead on some
+ platforms. You can use the <xref linkend="pgtesttiming"/> tool to
+ measure the overhead of timing on your system.
+ I/O timing information is
+ displayed in <link linkend="monitoring-pg-stat-wal-view">
+ <structname>pg_stat_wal</structname></link>. Only superusers can
+ change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-track-functions" xreflabel="track_functions">
<term><varname>track_functions</varname> (<type>enum</type>)
<indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 98e19954..6d5d58e6 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -3447,12 +3447,88 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
</thead>
<tbody>
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_records</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Total number of WAL records generated
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_fpi</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Total number of WAL full page images generated
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_bytes</structfield> <type>numeric</type>
+ </para>
+ <para>
+ Total amount of WAL bytes generated
+ </para></entry>
+ </row>
+
<row>
<entry role="catalog_table_entry"><para role="column_definition">
<structfield>wal_buffers_full</structfield> <type>bigint</type>
</para>
<para>
- Number of times WAL data was written to the disk because WAL buffers got full
+ Total number of times WAL data was written to the disk because WAL buffers got full
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_init_file</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Total number of times WAL file segment was created
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Total number of times WAL data was written to the disk
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_write_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time that has been spent in the portion of
+ WAL data was written to disk, in milliseconds
+ (if <xref linkend="guc-track-wal-io-timing"/> is enabled, otherwise zero)
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync</structfield> <type>bigint</type>
+ </para>
+ <para>
+ Total number of times WAL data was synced to the disk
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>wal_sync_time</structfield> <type>double precision</type>
+ </para>
+ <para>
+ Total amount of time that has been spent in the portion of
+ WAL data was synced to disk, in milliseconds
+ (if <xref linkend="guc-track-wal-io-timing"/> is enabled, otherwise zero)
</para></entry>
</row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index aa63f376..30442fb1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -109,6 +109,7 @@ int CommitDelay = 0; /* precommit delay in microseconds */
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
int wal_retrieve_retry_interval = 5000;
int max_slot_wal_keep_size_mb = -1;
+bool track_wal_io_timing = false;
#ifdef WAL_DEBUG
bool XLOG_DEBUG = false;
@@ -2527,6 +2528,8 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
Size nbytes;
Size nleft;
int written;
+ instr_time start;
+ instr_time duration;
/* OK to write the page(s) */
from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
@@ -2535,9 +2538,22 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
do
{
errno = 0;
+ if (track_wal_io_timing)
+ INSTR_TIME_SET_CURRENT(start);
+
pgstat_report_wait_start(WAIT_EVENT_WAL_WRITE);
written = pg_pwrite(openLogFile, from, nleft, startoffset);
pgstat_report_wait_end();
+
+ if (track_wal_io_timing)
+ {
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_write_time += INSTR_TIME_GET_MILLISEC(duration);
+ }
+
+ WalStats.m_wal_write++;
+
if (written <= 0)
{
char xlogfname[MAXFNAMELEN];
@@ -3418,6 +3434,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock)
errmsg("could not open file \"%s\": %m", path)));
elog(DEBUG2, "done creating and filling new WAL file");
+ WalStats.m_wal_init_file++;
return fd;
}
@@ -10406,8 +10423,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra)
void
issue_xlog_fsync(int fd, XLogSegNo segno)
{
+ instr_time start;
+ instr_time duration;
char *msg = NULL;
+ if (track_wal_io_timing)
+ INSTR_TIME_SET_CURRENT(start);
+
pgstat_report_wait_start(WAIT_EVENT_WAL_SYNC);
switch (sync_method)
{
@@ -10451,6 +10473,15 @@ issue_xlog_fsync(int fd, XLogSegNo segno)
}
pgstat_report_wait_end();
+
+ if (track_wal_io_timing)
+ {
+ INSTR_TIME_SET_CURRENT(duration);
+ INSTR_TIME_SUBTRACT(duration, start);
+ WalStats.m_wal_sync_time += INSTR_TIME_GET_MILLISEC(duration);
+ }
+
+ WalStats.m_wal_sync++;
}
/*
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2e4aa1c4..897bb59b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -993,7 +993,15 @@ CREATE VIEW pg_stat_bgwriter AS
CREATE VIEW pg_stat_wal AS
SELECT
+ w.wal_records,
+ w.wal_fpi,
+ w.wal_bytes,
w.wal_buffers_full,
+ w.wal_init_file,
+ w.wal_write,
+ w.wal_write_time,
+ w.wal_sync,
+ w.wal_sync_time,
w.stats_reset
FROM pg_stat_get_wal() w;
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index e76e627c..58704ea5 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,7 @@
#include "catalog/pg_database.h"
#include "catalog/pg_proc.h"
#include "common/ip.h"
+#include "executor/instrument.h"
#include "libpq/libpq.h"
#include "libpq/pqsignal.h"
#include "mb/pg_wchar.h"
@@ -142,6 +143,7 @@ char *pgstat_stat_tmpname = NULL;
*/
PgStat_MsgBgWriter BgWriterStats;
PgStat_MsgWal WalStats;
+WalUsage prevWalUsage;
/*
* List of SLRU names that we keep stats for. There is no central registry of
@@ -3048,6 +3050,9 @@ pgstat_initialize(void)
MyBEEntry = &BackendStatusArray[MaxBackends + MyAuxProcType];
}
+ /* Initialize WAL usage status information */
+ prevWalUsage = pgWalUsage;
+
/* Set up a process-exit hook to clean up */
on_shmem_exit(pgstat_beshutdown_hook, 0);
}
@@ -4574,6 +4579,15 @@ pgstat_send_bgwriter(void)
void
pgstat_send_wal(void)
{
+ /* fill in some values using pgWalUsage */
+ WalUsage walusage;
+ memset(&walusage, 0, sizeof(WalUsage));
+ WalUsageAccumDiff(&walusage, &pgWalUsage, &prevWalUsage);
+
+ WalStats.m_wal_bytes = walusage.wal_bytes;
+ WalStats.m_wal_records = walusage.wal_records;
+ WalStats.m_wal_fpi = walusage.wal_fpi;
+
/* We assume this initializes to zeroes */
static const PgStat_MsgWal all_zeroes;
@@ -4591,6 +4605,9 @@ pgstat_send_wal(void)
pgstat_setheader(&WalStats.m_hdr, PGSTAT_MTYPE_WAL);
pgstat_send(&WalStats, sizeof(WalStats));
+ /* remember the current numbers */
+ prevWalUsage = pgWalUsage;
+
/*
* Clear out the statistics buffer, so it can be re-used.
*/
@@ -6759,7 +6776,15 @@ pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len)
static void
pgstat_recv_wal(PgStat_MsgWal *msg, int len)
{
+ walStats.wal_records += msg->m_wal_records;
+ walStats.wal_fpi += msg->m_wal_fpi;
+ walStats.wal_bytes += msg->m_wal_bytes;
walStats.wal_buffers_full += msg->m_wal_buffers_full;
+ walStats.wal_init_file += msg->m_wal_init_file;
+ walStats.wal_write += msg->m_wal_write;
+ walStats.wal_write_time += msg->m_wal_write_time;
+ walStats.wal_sync += msg->m_wal_sync;
+ walStats.wal_sync_time += msg->m_wal_sync_time;
}
/* ----------
diff --git a/src/backend/postmaster/walwriter.c b/src/backend/postmaster/walwriter.c
index a52832fe..ce9f4b7c 100644
--- a/src/backend/postmaster/walwriter.c
+++ b/src/backend/postmaster/walwriter.c
@@ -253,6 +253,9 @@ WalWriterMain(void)
else if (left_till_hibernate > 0)
left_till_hibernate--;
+ /* Send WAL statistics */
+ pgstat_send_wal();
+
/*
* Sleep until we are signaled or WalWriterDelay has elapsed. If we
* haven't done anything useful for quite some time, lengthen the
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index a210fc93..3d1f788d 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -1703,10 +1703,11 @@ pg_stat_get_buf_alloc(PG_FUNCTION_ARGS)
Datum
pg_stat_get_wal(PG_FUNCTION_ARGS)
{
-#define PG_STAT_GET_WAL_COLS 2
+#define PG_STAT_GET_WAL_COLS 10
TupleDesc tupdesc;
Datum values[PG_STAT_GET_WAL_COLS];
bool nulls[PG_STAT_GET_WAL_COLS];
+ char buf[256];
PgStat_WalStats *wal_stats;
/* Initialise values and NULL flags arrays */
@@ -1715,19 +1716,57 @@ pg_stat_get_wal(PG_FUNCTION_ARGS)
/* Initialise attributes information in the tuple descriptor */
tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_WAL_COLS);
- TupleDescInitEntry(tupdesc, (AttrNumber) 1, "wal_buffers_full",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 1, "wal_records",
INT8OID, -1, 0);
- TupleDescInitEntry(tupdesc, (AttrNumber) 2, "stats_reset",
+ TupleDescInitEntry(tupdesc, (AttrNumber) 2, "wal_fpi",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 3, "wal_bytes",
+ NUMERICOID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 4, "wal_buffers_full",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 5, "wal_init_file",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 6, "wal_write",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 7, "wal_write_time",
+ FLOAT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 8, "wal_sync",
+ INT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 9, "wal_sync_time",
+ FLOAT8OID, -1, 0);
+ TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
TIMESTAMPTZOID, -1, 0);
BlessTupleDesc(tupdesc);
+
/* Get statistics about WAL activity */
wal_stats = pgstat_fetch_stat_wal();
/* Fill values and NULLs */
- values[0] = Int64GetDatum(wal_stats->wal_buffers_full);
- values[1] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
+ values[0] = Int64GetDatum(wal_stats->wal_records);
+ values[1] = Int64GetDatum(wal_stats->wal_fpi);
+
+ /* Convert to numeric. */
+ snprintf(buf, sizeof buf, UINT64_FORMAT, wal_stats->wal_bytes);
+ values[2] = DirectFunctionCall3(numeric_in,
+ CStringGetDatum(buf),
+ ObjectIdGetDatum(0),
+ Int32GetDatum(-1));
+
+ values[3] = Int64GetDatum(wal_stats->wal_buffers_full);
+ values[4] = Int64GetDatum(wal_stats->wal_init_file);
+ values[5] = Int64GetDatum(wal_stats->wal_write);
+
+ /* time is already in msec, just convert to double for presentation */
+ values[6] = Float8GetDatum((double) wal_stats->wal_write_time);
+
+ values[7] = Int64GetDatum(wal_stats->wal_sync);
+
+ /* time is already in msec, just convert to double for presentation */
+ values[8] = Float8GetDatum((double) wal_stats->wal_sync_time);
+
+ values[9] = TimestampTzGetDatum(wal_stats->stat_reset_timestamp);
/* Returns the record as Datum */
PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bb34630e..2a3f452f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1482,6 +1482,15 @@ static struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+ {
+ {"track_wal_io_timing", PGC_SUSET, STATS_COLLECTOR,
+ gettext_noop("Collects timing statistics for WAL I/O activity."),
+ NULL
+ },
+ &track_wal_io_timing,
+ false,
+ NULL, NULL, NULL
+ },
{
{"update_process_title", PGC_SUSET, PROCESS_TITLE,
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9cb571f7..46c87584 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -584,6 +584,7 @@
#track_activities = on
#track_counts = on
#track_io_timing = off
+#track_wal_io_timing = off
#track_functions = none # none, pl, all
#track_activity_query_size = 1024 # (change requires restart)
#stats_temp_directory = 'pg_stat_tmp'
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 221af87e..ce695708 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -131,6 +131,7 @@ extern int recovery_min_apply_delay;
extern char *PrimaryConnInfo;
extern char *PrimarySlotName;
extern bool wal_receiver_create_temp_slot;
+extern bool track_wal_io_timing;
/* indirectly set via GUC system */
extern TransactionId recoveryTargetXid;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index c01da4bf..4c7b7f5f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5500,12 +5500,13 @@
proname => 'pg_stat_get_buf_alloc', provolatile => 's', proparallel => 'r',
prorettype => 'int8', proargtypes => '', prosrc => 'pg_stat_get_buf_alloc' },
-{ oid => '1136', descr => 'statistics: information about WAL activity',
- proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
- proparallel => 'r', prorettype => 'record', proargtypes => '',
- proallargtypes => '{int8,timestamptz}', proargmodes => '{o,o}',
- proargnames => '{wal_buffers_full,stats_reset}',
- prosrc => 'pg_stat_get_wal' },
+ { oid => '1136', descr => 'statistics: information about WAL activity',
+ proname => 'pg_stat_get_wal', proisstrict => 'f', provolatile => 's',
+ proparallel => 'r', prorettype => 'record', proargtypes => '',
+ proallargtypes => '{int8,int8,numeric,int8,int8,int8,float8,int8,float8,timestamptz}',
+ proargmodes => '{o,o,o,o,o,o,o,o,o,o}',
+ proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,wal_init_file,wal_write,wal_write_time,wal_sync,wal_sync_time,stats_reset}',
+ prosrc => 'pg_stat_get_wal' },
{ oid => '2306', descr => 'statistics: information about SLRU caches',
proname => 'pg_stat_get_slru', prorows => '100', proisstrict => 'f',
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 257e515b..dcc62452 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -12,6 +12,7 @@
#define PGSTAT_H
#include "datatype/timestamp.h"
+#include "executor/instrument.h"
#include "libpq/pqcomm.h"
#include "miscadmin.h"
#include "port/atomics.h"
@@ -459,7 +460,15 @@ typedef struct PgStat_MsgBgWriter
typedef struct PgStat_MsgWal
{
PgStat_MsgHdr m_hdr;
+ PgStat_Counter m_wal_records;
+ PgStat_Counter m_wal_fpi;
+ uint64 m_wal_bytes;
PgStat_Counter m_wal_buffers_full;
+ PgStat_Counter m_wal_init_file;
+ PgStat_Counter m_wal_write;
+ PgStat_Counter m_wal_write_time; /* accumulate times in milliseconds */
+ PgStat_Counter m_wal_sync;
+ PgStat_Counter m_wal_sync_time; /* accumulate times in milliseconds */
} PgStat_MsgWal;
/* ----------
@@ -798,7 +807,16 @@ typedef struct PgStat_GlobalStats
*/
typedef struct PgStat_WalStats
{
+ PgStat_Counter wal_records;
+ PgStat_Counter wal_fpi;
+ uint64 wal_bytes;
PgStat_Counter wal_buffers_full;
+ PgStat_Counter wal_init_file;
+ PgStat_Counter wal_write;
+ PgStat_Counter wal_write_time;
+ PgStat_Counter wal_sync;
+ PgStat_Counter wal_sync_time;
+
TimestampTz stat_reset_timestamp;
} PgStat_WalStats;
@@ -1340,6 +1358,7 @@ extern PgStat_MsgBgWriter BgWriterStats;
* WAL statistics counter is updated by backends and background processes
*/
extern PgStat_MsgWal WalStats;
+extern WalUsage prevWalUsage;
/*
* Updated by pgstat_count_buffer_*_time macros
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 097ff5d1..302ecaa8 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2138,9 +2138,17 @@ pg_stat_user_tables| SELECT pg_stat_all_tables.relid,
pg_stat_all_tables.autoanalyze_count
FROM pg_stat_all_tables
WHERE ((pg_stat_all_tables.schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (pg_stat_all_tables.schemaname !~ '^pg_toast'::text));
-pg_stat_wal| SELECT w.wal_buffers_full,
+pg_stat_wal| SELECT w.wal_records,
+ w.wal_fpi,
+ w.wal_bytes,
+ w.wal_buffers_full,
+ w.wal_init_file,
+ w.wal_write,
+ w.wal_write_time,
+ w.wal_sync,
+ w.wal_sync_time,
w.stats_reset
- FROM pg_stat_get_wal() w(wal_buffers_full, stats_reset);
+ FROM pg_stat_get_wal() w(wal_records, wal_fpi, wal_bytes, wal_buffers_full, wal_init_file, wal_write, wal_write_time, wal_sync, wal_sync_time, stats_reset);
pg_stat_wal_receiver| SELECT s.pid,
s.status,
s.receive_start_lsn,