Hi, On Wed, Mar 05, 2025 at 05:26:40PM +0000, Bertrand Drouvot wrote: > So yeah, back to the issue, we have to pay more attention for the WAL stats > because pgWalUsage is "incremented" without any check with > pgstat_tracks_backend_bktype() > (that's not the case for the IO stats where the counters are incremented > taking > into account pgstat_tracks_backend_bktype()). > > So for the moment, in the attached I "just" add a > pgstat_tracks_backend_bktype() > check in pgstat_backend_have_pending_cb() but I'm not sure I like it that > much... > > Will think more about it...
After giving more thoughts, I think that the way it's currently done makes sense. There is no need to check pgstat_tracks_backend_bktype() while incrementing pgWalUsage or to create a "BackendpgWalUsage" (that would be incremented based on the pgstat_tracks_backend_bktype()). In pgstat_create_backend() (called based on pgstat_tracks_backend_bktype()): - PendingBackendStats is initialized to zero - prevBackendWalUsage is initialized to pgWalUsage But: 1. PendingBackendStats is "incremented" during IO operations, so that it makes sense to ensure that pgstat_tracks_backend_bktype() returns true in those functions (i.e pgstat_count_backend_io_op_time() and pgstat_count_backend_io_op()). 2. prevBackendWalUsage is not incremented, it's just there to compute the delta with pgWalUsage in pgstat_flush_backend_entry_wal(). pgstat_flush_backend_entry_wal() is only called from pgstat_flush_backend() that does the pgstat_tracks_backend_bktype() before. And so is pgstat_flush_backend_entry_io(). So, I think it's fine the way it's done here. The missing part was in pgstat_backend_have_pending_cb() which has been fixed. But this missing part was already there for the IO stats. It just not manifested yet maybe because PendingBackendStats was full of zeroes by "luck". Indeed, there is no reason for pgstat_backend_have_pending_cb() to return true if pgstat_tracks_backend_bktype() is not satisfied. So it deserves a dedicated patch to fix this already existing issue: 0001 attached. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
>From 38c718f8cd2c079219e5341e72e99664f150a581 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot <bertranddrouvot...@gmail.com> Date: Thu, 6 Mar 2025 09:49:49 +0000 Subject: [PATCH v16 1/2] Add an extra check in pgstat_backend_have_pending_cb() There is no reason for pgstat_backend_have_pending_cb() to not check for pgstat_tracks_backend_bktype(). It could wrongly reports true should PendingBackendStats not be full of zeroes. --- src/backend/utils/activity/pgstat_backend.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) 100.0% src/backend/utils/activity/ diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c index a9343b7b59e..1d94d3176b2 100644 --- a/src/backend/utils/activity/pgstat_backend.c +++ b/src/backend/utils/activity/pgstat_backend.c @@ -222,8 +222,11 @@ pgstat_flush_backend(bool nowait, bits32 flags) bool pgstat_backend_have_pending_cb(void) { - return (!pg_memory_is_all_zeros(&PendingBackendStats, - sizeof(struct PgStat_BackendPending))); + if (!pgstat_tracks_backend_bktype(MyBackendType)) + return false; + else + return (!pg_memory_is_all_zeros(&PendingBackendStats, + sizeof(struct PgStat_BackendPending))); } /* -- 2.34.1
>From b3edf7c2bd448dc370847ba119ab9007b5b92b8f Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot <bertranddrouvot...@gmail.com> Date: Mon, 6 Jan 2025 10:00:00 +0000 Subject: [PATCH v16 2/2] per backend WAL statistics Now that commit 9aea73fc61 added backend-level statistics to pgstats (and per backend IO statistics) we can more easily add per backend statistics. This commit adds per backend WAL statistics using the same layer as pg_stat_wal, except that it is now possible to know how much WAL activity is happening in each backend rather than an overall aggregate of all the activity. A function called pg_stat_get_backend_wal() is added to access this data depending on the PID of a backend. The same limitation as in 9aea73fc61 persists, meaning that Auxiliary processes are not included in this set of statistics. XXX: bump catalog version XXX: bump of stats file format not required, as backend stats do not persist on disk. --- doc/src/sgml/monitoring.sgml | 19 ++++++ src/backend/utils/activity/pgstat_backend.c | 70 ++++++++++++++++++++- src/backend/utils/activity/pgstat_wal.c | 1 + src/backend/utils/adt/pgstatfuncs.c | 26 +++++++- src/include/catalog/pg_proc.dat | 7 +++ src/include/pgstat.h | 37 +++++------ src/include/utils/pgstat_internal.h | 3 +- src/test/regress/expected/stats.out | 14 +++++ src/test/regress/sql/stats.sql | 6 ++ 9 files changed, 160 insertions(+), 23 deletions(-) 15.1% doc/src/sgml/ 42.6% src/backend/utils/activity/ 14.7% src/backend/utils/adt/ 8.3% src/include/catalog/ 4.2% src/include/utils/ 8.0% src/test/regress/expected/ 6.1% src/test/regress/sql/ diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 16646f560e8..b1710680705 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -4866,6 +4866,25 @@ description | Waiting for a newly initialized WAL file to reach durable storage </para></entry> </row> + <row> + <entry id="pg-stat-get-backend-wal" role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_stat_get_backend_wal</primary> + </indexterm> + <function>pg_stat_get_backend_wal</function> ( <type>integer</type> ) + <returnvalue>record</returnvalue> + </para> + <para> + Returns WAL statistics about the backend with the specified + process ID. The output fields are exactly the same as the ones in the + <structname>pg_stat_wal</structname> view. + </para> + <para> + The function does not return WAL statistics for the checkpointer, + the background writer, the startup process and the autovacuum launcher. + </para></entry> + </row> + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c index 1d94d3176b2..b2c82f37301 100644 --- a/src/backend/utils/activity/pgstat_backend.c +++ b/src/backend/utils/activity/pgstat_backend.c @@ -38,6 +38,14 @@ */ static PgStat_BackendPending PendingBackendStats; +/* + * WAL usage counters saved from pgWalUsage at the previous call to + * pgstat_report_wal(). This is used to calculate how much WAL usage + * happens between pgstat_report_wal() calls, by subtracting + * the previous counters from the current ones. + */ +static WalUsage prevBackendWalUsage; + /* * Utility routines to report I/O stats for backends, kept here to avoid * exposing PendingBackendStats to the outside world. @@ -184,6 +192,57 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref) MemSet(&PendingBackendStats.pending_io, 0, sizeof(PgStat_PendingIO)); } +/* + * To determine whether WAL usage happened. + */ +static bool +pgstat_backend_wal_have_pending(void) +{ + return pgWalUsage.wal_records != prevBackendWalUsage.wal_records; +} + +/* + * Flush out locally pending backend WAL statistics. Locking is managed + * by the caller. + */ +static void +pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref) +{ + PgStatShared_Backend *shbackendent; + PgStat_WalCounters *bktype_shstats; + WalUsage wal_usage_diff = {0}; + + /* + * This function can be called even if nothing at all has happened. Avoid + * taking lock for nothing in that case. + */ + if (!pgstat_backend_wal_have_pending()) + return; + + shbackendent = (PgStatShared_Backend *) entry_ref->shared_stats; + bktype_shstats = &shbackendent->stats.wal_counters; + + /* + * We don't update the WAL usage portion of the local WalStats elsewhere. + * Calculate how much WAL usage counters were increased by subtracting the + * previous counters from the current ones. + */ + WalUsageAccumDiff(&wal_usage_diff, &pgWalUsage, &prevBackendWalUsage); + +#define WALSTAT_ACC(fld, var_to_add) \ + (bktype_shstats->fld += var_to_add.fld) + WALSTAT_ACC(wal_buffers_full, wal_usage_diff); + WALSTAT_ACC(wal_records, wal_usage_diff); + WALSTAT_ACC(wal_fpi, wal_usage_diff); + WALSTAT_ACC(wal_bytes, wal_usage_diff); +#undef WALSTAT_ACC + + /* + * Save the current counters for the subsequent calculation of WAL usage. + */ + prevBackendWalUsage = pgWalUsage; +} + /* * Flush out locally pending backend statistics * @@ -199,7 +258,8 @@ pgstat_flush_backend(bool nowait, bits32 flags) return false; if (pg_memory_is_all_zeros(&PendingBackendStats, - sizeof(struct PgStat_BackendPending))) + sizeof(struct PgStat_BackendPending)) + && !pgstat_backend_wal_have_pending()) return false; entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_BACKEND, InvalidOid, @@ -211,6 +271,9 @@ pgstat_flush_backend(bool nowait, bits32 flags) if (flags & PGSTAT_BACKEND_FLUSH_IO) pgstat_flush_backend_entry_io(entry_ref); + if (flags & PGSTAT_BACKEND_FLUSH_WAL) + pgstat_flush_backend_entry_wal(entry_ref); + pgstat_unlock_entry(entry_ref); return false; @@ -226,7 +289,8 @@ pgstat_backend_have_pending_cb(void) return false; else return (!pg_memory_is_all_zeros(&PendingBackendStats, - sizeof(struct PgStat_BackendPending))); + sizeof(struct PgStat_BackendPending)) + || pgstat_backend_wal_have_pending()); } /* @@ -261,6 +325,8 @@ pgstat_create_backend(ProcNumber procnum) pgstat_unlock_entry(entry_ref); MemSet(&PendingBackendStats, 0, sizeof(PgStat_BackendPending)); + + prevBackendWalUsage = pgWalUsage; } /* diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c index 5d3da4b674e..16a1ecb4d90 100644 --- a/src/backend/utils/activity/pgstat_wal.c +++ b/src/backend/utils/activity/pgstat_wal.c @@ -52,6 +52,7 @@ pgstat_report_wal(bool force) /* flush wal stats */ (void) pgstat_wal_flush_cb(nowait); + pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL); /* flush IO stats */ pgstat_flush_io(nowait); diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 9172e1cb9d2..662ce46cbc2 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -1609,8 +1609,8 @@ pg_stat_get_backend_io(PG_FUNCTION_ARGS) /* * pg_stat_wal_build_tuple * - * Helper routine for pg_stat_get_wal() returning one tuple based on the - * contents of wal_counters. + * Helper routine for pg_stat_get_wal() and pg_stat_get_backend_wal() + * returning one tuple based on the contents of wal_counters. */ static Datum pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters, @@ -1659,6 +1659,28 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters, PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls))); } +/* + * Returns WAL statistics for a backend with given PID. + */ +Datum +pg_stat_get_backend_wal(PG_FUNCTION_ARGS) +{ + int pid; + PgStat_Backend *backend_stats; + PgStat_WalCounters bktype_stats; + + pid = PG_GETARG_INT32(0); + backend_stats = pgstat_fetch_stat_backend_by_pid(pid, NULL); + + if (!backend_stats) + PG_RETURN_NULL(); + + bktype_stats = backend_stats->wal_counters; + + /* save tuples with data from this PgStat_WalCounters */ + return (pg_stat_wal_build_tuple(bktype_stats, backend_stats->stat_reset_timestamp)); +} + /* * Returns statistics of WAL activity */ diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 134b3dd8689..064d47f63a4 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5954,6 +5954,13 @@ proargmodes => '{o,o,o,o,o}', proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,stats_reset}', prosrc => 'pg_stat_get_wal' }, +{ oid => '8037', descr => 'statistics: backend WAL activity', + proname => 'pg_stat_get_backend_wal', provolatile => 'v', + proparallel => 'r', prorettype => 'record', proargtypes => 'int4', + proallargtypes => '{int4,int8,int8,numeric,int8,timestamptz}', + proargmodes => '{i,o,o,o,o,o}', + proargnames => '{backend_pid,wal_records,wal_fpi,wal_bytes,wal_buffers_full,stats_reset}', + prosrc => 'pg_stat_get_backend_wal' }, { oid => '6248', descr => 'statistics: information about WAL prefetching', proname => 'pg_stat_get_recovery_prefetch', prorows => '1', proretset => 't', provolatile => 'v', prorettype => 'record', proargtypes => '', diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 4aad10b0b6d..06359b9157d 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -340,24 +340,6 @@ typedef struct PgStat_IO PgStat_BktypeIO stats[BACKEND_NUM_TYPES]; } PgStat_IO; -typedef struct PgStat_Backend -{ - TimestampTz stat_reset_timestamp; - PgStat_BktypeIO io_stats; -} PgStat_Backend; - -/* --------- - * PgStat_BackendPending Non-flushed backend stats. - * --------- - */ -typedef struct PgStat_BackendPending -{ - /* - * Backend statistics store the same amount of IO data as PGSTAT_KIND_IO. - */ - PgStat_PendingIO pending_io; -} PgStat_BackendPending; - typedef struct PgStat_StatDBEntry { PgStat_Counter xact_commit; @@ -500,6 +482,25 @@ typedef struct PgStat_WalStats TimestampTz stat_reset_timestamp; } PgStat_WalStats; +typedef struct PgStat_Backend +{ + TimestampTz stat_reset_timestamp; + PgStat_BktypeIO io_stats; + PgStat_WalCounters wal_counters; +} PgStat_Backend; + +/* --------- + * PgStat_BackendPending Non-flushed backend stats. + * --------- + */ +typedef struct PgStat_BackendPending +{ + /* + * Backend statistics store the same amount of IO data as PGSTAT_KIND_IO. + */ + PgStat_PendingIO pending_io; +} PgStat_BackendPending; + /* * Functions in pgstat.c */ diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h index 36d228e3558..d5557e6e998 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -622,7 +622,8 @@ extern void pgstat_archiver_snapshot_cb(void); /* flags for pgstat_flush_backend() */ #define PGSTAT_BACKEND_FLUSH_IO (1 << 0) /* Flush I/O statistics */ -#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO) +#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */ +#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL) extern bool pgstat_flush_backend(bool nowait, bits32 flags); extern bool pgstat_backend_flush_cb(bool nowait); diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index 093e6368dbb..b3c303c98cb 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -832,6 +832,8 @@ SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELEC SELECT num_requested AS rqst_ckpts_before FROM pg_stat_checkpointer \gset -- Test pg_stat_wal (and make a temp table so our temp schema exists) SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset +-- Test pg_stat_get_backend_wal (and make a temp table so our temp schema exists) +SELECT wal_bytes AS backend_wal_bytes_before from pg_stat_get_backend_wal(pg_backend_pid()) \gset CREATE TEMP TABLE test_stats_temp AS SELECT 17; DROP TABLE test_stats_temp; -- Checkpoint twice: The checkpointer reports stats after reporting completion @@ -851,6 +853,18 @@ SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal; t (1 row) +SELECT pg_stat_force_next_flush(); + pg_stat_force_next_flush +-------------------------- + +(1 row) + +SELECT wal_bytes > :backend_wal_bytes_before FROM pg_stat_get_backend_wal(pg_backend_pid()); + ?column? +---------- + t +(1 row) + -- Test pg_stat_get_backend_idset() and some allied functions. -- In particular, verify that their notion of backend ID matches -- our temp schema index. diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index 0a44e14d9f4..ad3f7b7e66a 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -423,6 +423,9 @@ SELECT num_requested AS rqst_ckpts_before FROM pg_stat_checkpointer \gset -- Test pg_stat_wal (and make a temp table so our temp schema exists) SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset +-- Test pg_stat_get_backend_wal (and make a temp table so our temp schema exists) +SELECT wal_bytes AS backend_wal_bytes_before from pg_stat_get_backend_wal(pg_backend_pid()) \gset + CREATE TEMP TABLE test_stats_temp AS SELECT 17; DROP TABLE test_stats_temp; @@ -435,6 +438,9 @@ CHECKPOINT; SELECT num_requested > :rqst_ckpts_before FROM pg_stat_checkpointer; SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal; +SELECT pg_stat_force_next_flush(); +SELECT wal_bytes > :backend_wal_bytes_before FROM pg_stat_get_backend_wal(pg_backend_pid()); + -- Test pg_stat_get_backend_idset() and some allied functions. -- In particular, verify that their notion of backend ID matches -- our temp schema index. -- 2.34.1