Hi, On Mon, Mar 10, 2025 at 04:46:53PM +0900, Michael Paquier wrote: > On Sat, Mar 08, 2025 at 07:53:04AM +0000, Bertrand Drouvot wrote: > > That would not be an issue should we only access the struct > > fields in the code, but that's not the case as we're making use of > > pg_memory_is_all_zeros() on it. > > It does not hurt to keep it as it is, honestly.
I believe that's worse than before actually. Before padding bytes would "probably" be set to zeros while now it's certainly not always the case. I think that we already removed this (see comments === 4 in [1]). > I've reviewed the last patch of the series Thanks! > and noticed a couple of > inconsistent comments across it, and some indentation issue. I think I ran pgindent though. Anyway, thanks for fixing those! > @@ -199,7 +258,8 @@ pgstat_flush_backend(bool nowait, bits32 flags) > return false; > > if (pg_memory_is_all_zeros(&PendingBackendStats, > - sizeof(struct PgStat_BackendPending))) > + sizeof(struct PgStat_BackendPending)) > + && !pgstat_backend_wal_have_pending()) > return false; > > I have one issue with pgstat_flush_backend() and the early exit check > done here. If for example we use FLUSH_WAL but there is some IO data > pending, we would lock the stats entry for nothing. We could also > return true even if there is no pending WAL data if the lock could not > be taken, which would be incorrect because there was no data to flush > to begin with. I think that this should be adjusted so as we limit > the entry lock depending on the flags given in input, like in the > attached. Yeah I agree this needs to be improved, thanks! + /* Some IO data pending? */ + if ((flags & PGSTAT_BACKEND_FLUSH_IO) && + !pg_memory_is_all_zeros(&PendingBackendStats, + sizeof(struct PgStat_BackendPending))) + has_pending_data = true; if PgStat_BackendPending contains more than pending_io in the future, then that would check for zeros in a too large memory region. I think it's better to check for: if (pg_memory_is_all_zeros(&PendingBackendStats.pending_io, sizeof(struct PgStat_PendingIO))) like in the attached. Or check on "backend_has_iostats" (if 0002 in [2] goes in). + /* Some WAL data pending? */ + if ((flags & PGSTAT_BACKEND_FLUSH_WAL) && + pgstat_backend_wal_have_pending()) + has_pending_data = true; I think we can use "else if" here (done in the attached) as it's not needed if has_pending_data is already set to true. That's the only 2 changes done in the attached as compared to the previous version. Regards, [1]: https://www.postgresql.org/message-id/Z44vMD/rALy8pfVE%40ip-10-97-1-34.eu-west-3.compute.internal [2]: https://www.postgresql.org/message-id/Z8WYf1jyy4MwOveQ%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
>From d93548d9b14dfba7c6ef3c8544c441ec29c12361 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Mon, 10 Mar 2025 16:45:26 +0900 Subject: [PATCH v18] per backend WAL statistics Now that commit 9aea73fc61 added backend-level statistics to pgstats (and per backend IO statistics) we can more easily add per backend statistics. This commit adds per backend WAL statistics using the same layer as pg_stat_wal, except that it is now possible to know how much WAL activity is happening in each backend rather than an overall aggregate of all the activity. A function called pg_stat_get_backend_wal() is added to access this data depending on the PID of a backend. The same limitation as in 9aea73fc61 persists, meaning that Auxiliary processes are not included in this set of statistics. XXX: bump catalog version XXX: bump of stats file format not required, as backend stats do not persist on disk. --- doc/src/sgml/monitoring.sgml | 19 +++++ src/backend/utils/activity/pgstat_backend.c | 86 ++++++++++++++++++++- src/backend/utils/activity/pgstat_wal.c | 1 + src/backend/utils/adt/pgstatfuncs.c | 26 ++++++- src/include/catalog/pg_proc.dat | 7 ++ src/include/pgstat.h | 41 +++++----- src/include/utils/pgstat_internal.h | 3 +- src/test/regress/expected/stats.out | 17 +++- src/test/regress/sql/stats.sql | 9 ++- 9 files changed, 183 insertions(+), 26 deletions(-) 13.0% doc/src/sgml/ 46.2% src/backend/utils/activity/ 12.8% src/backend/utils/adt/ 7.2% src/include/catalog/ 3.7% src/include/utils/ 8.4% src/test/regress/expected/ 6.8% src/test/regress/sql/ diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 16646f560e8..b1710680705 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -4866,6 +4866,25 @@ description | Waiting for a newly initialized WAL file to reach durable storage </para></entry> </row> + <row> + <entry id="pg-stat-get-backend-wal" role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>pg_stat_get_backend_wal</primary> + </indexterm> + <function>pg_stat_get_backend_wal</function> ( <type>integer</type> ) + <returnvalue>record</returnvalue> + </para> + <para> + Returns WAL statistics about the backend with the specified + process ID. The output fields are exactly the same as the ones in the + <structname>pg_stat_wal</structname> view. + </para> + <para> + The function does not return WAL statistics for the checkpointer, + the background writer, the startup process and the autovacuum launcher. + </para></entry> + </row> + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/src/backend/utils/activity/pgstat_backend.c b/src/backend/utils/activity/pgstat_backend.c index 6efbb650aa8..f414a48c242 100644 --- a/src/backend/utils/activity/pgstat_backend.c +++ b/src/backend/utils/activity/pgstat_backend.c @@ -38,6 +38,14 @@ */ static PgStat_BackendPending PendingBackendStats = {0}; +/* + * WAL usage counters saved from pgWalUsage at the previous call to + * pgstat_report_wal(). This is used to calculate how much WAL usage + * happens between pgstat_report_wal() calls, by subtracting the previous + * counters from the current ones. + */ +static WalUsage prevBackendWalUsage; + /* * Utility routines to report I/O stats for backends, kept here to avoid * exposing PendingBackendStats to the outside world. @@ -184,6 +192,57 @@ pgstat_flush_backend_entry_io(PgStat_EntryRef *entry_ref) MemSet(&PendingBackendStats.pending_io, 0, sizeof(PgStat_PendingIO)); } +/* + * To determine whether WAL usage happened. + */ +static inline bool +pgstat_backend_wal_have_pending(void) +{ + return pgWalUsage.wal_records != prevBackendWalUsage.wal_records; +} + +/* + * Flush out locally pending backend WAL statistics. Locking is managed + * by the caller. + */ +static void +pgstat_flush_backend_entry_wal(PgStat_EntryRef *entry_ref) +{ + PgStatShared_Backend *shbackendent; + PgStat_WalCounters *bktype_shstats; + WalUsage wal_usage_diff = {0}; + + /* + * This function can be called even if nothing at all has happened for WAL + * statistics. In this case, avoid unnecessarily modifying the stats + * entry. + */ + if (!pgstat_backend_wal_have_pending()) + return; + + shbackendent = (PgStatShared_Backend *) entry_ref->shared_stats; + bktype_shstats = &shbackendent->stats.wal_counters; + + /* + * Calculate how much WAL usage counters were increased by subtracting the + * previous counters from the current ones. + */ + WalUsageAccumDiff(&wal_usage_diff, &pgWalUsage, &prevBackendWalUsage); + +#define WALSTAT_ACC(fld, var_to_add) \ + (bktype_shstats->fld += var_to_add.fld) + WALSTAT_ACC(wal_buffers_full, wal_usage_diff); + WALSTAT_ACC(wal_records, wal_usage_diff); + WALSTAT_ACC(wal_fpi, wal_usage_diff); + WALSTAT_ACC(wal_bytes, wal_usage_diff); +#undef WALSTAT_ACC + + /* + * Save the current counters for the subsequent calculation of WAL usage. + */ + prevBackendWalUsage = pgWalUsage; +} + /* * Flush out locally pending backend statistics * @@ -194,12 +253,22 @@ bool pgstat_flush_backend(bool nowait, bits32 flags) { PgStat_EntryRef *entry_ref; + bool has_pending_data = false; if (!pgstat_tracks_backend_bktype(MyBackendType)) return false; - if (pg_memory_is_all_zeros(&PendingBackendStats, - sizeof(struct PgStat_BackendPending))) + /* Some IO data pending? */ + if ((flags & PGSTAT_BACKEND_FLUSH_IO) && + !pg_memory_is_all_zeros(&PendingBackendStats.pending_io, + sizeof(struct PgStat_PendingIO))) + has_pending_data = true; + /* Some WAL data pending? */ + else if ((flags & PGSTAT_BACKEND_FLUSH_WAL) && + pgstat_backend_wal_have_pending()) + has_pending_data = true; + + if (!has_pending_data) return false; entry_ref = pgstat_get_entry_ref_locked(PGSTAT_KIND_BACKEND, InvalidOid, @@ -211,6 +280,9 @@ pgstat_flush_backend(bool nowait, bits32 flags) if (flags & PGSTAT_BACKEND_FLUSH_IO) pgstat_flush_backend_entry_io(entry_ref); + if (flags & PGSTAT_BACKEND_FLUSH_WAL) + pgstat_flush_backend_entry_wal(entry_ref); + pgstat_unlock_entry(entry_ref); return false; @@ -226,7 +298,8 @@ pgstat_backend_have_pending_cb(void) return false; return (!pg_memory_is_all_zeros(&PendingBackendStats, - sizeof(struct PgStat_BackendPending))); + sizeof(struct PgStat_BackendPending)) || + pgstat_backend_wal_have_pending()); } /* @@ -261,6 +334,13 @@ pgstat_create_backend(ProcNumber procnum) pgstat_unlock_entry(entry_ref); MemSet(&PendingBackendStats, 0, sizeof(PgStat_BackendPending)); + + /* + * Initialize prevBackendWalUsage with pgWalUsage so that + * pgstat_backend_flush_cb() can calculate how much pgWalUsage counters + * are increased by subtracting prevBackendWalUsage from pgWalUsage. + */ + prevBackendWalUsage = pgWalUsage; } /* diff --git a/src/backend/utils/activity/pgstat_wal.c b/src/backend/utils/activity/pgstat_wal.c index 5d3da4b674e..16a1ecb4d90 100644 --- a/src/backend/utils/activity/pgstat_wal.c +++ b/src/backend/utils/activity/pgstat_wal.c @@ -52,6 +52,7 @@ pgstat_report_wal(bool force) /* flush wal stats */ (void) pgstat_wal_flush_cb(nowait); + pgstat_flush_backend(nowait, PGSTAT_BACKEND_FLUSH_WAL); /* flush IO stats */ pgstat_flush_io(nowait); diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 9172e1cb9d2..662ce46cbc2 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -1609,8 +1609,8 @@ pg_stat_get_backend_io(PG_FUNCTION_ARGS) /* * pg_stat_wal_build_tuple * - * Helper routine for pg_stat_get_wal() returning one tuple based on the - * contents of wal_counters. + * Helper routine for pg_stat_get_wal() and pg_stat_get_backend_wal() + * returning one tuple based on the contents of wal_counters. */ static Datum pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters, @@ -1659,6 +1659,28 @@ pg_stat_wal_build_tuple(PgStat_WalCounters wal_counters, PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls))); } +/* + * Returns WAL statistics for a backend with given PID. + */ +Datum +pg_stat_get_backend_wal(PG_FUNCTION_ARGS) +{ + int pid; + PgStat_Backend *backend_stats; + PgStat_WalCounters bktype_stats; + + pid = PG_GETARG_INT32(0); + backend_stats = pgstat_fetch_stat_backend_by_pid(pid, NULL); + + if (!backend_stats) + PG_RETURN_NULL(); + + bktype_stats = backend_stats->wal_counters; + + /* save tuples with data from this PgStat_WalCounters */ + return (pg_stat_wal_build_tuple(bktype_stats, backend_stats->stat_reset_timestamp)); +} + /* * Returns statistics of WAL activity */ diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index cede992b6e2..42e427f8fe8 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -5973,6 +5973,13 @@ proargmodes => '{o,o,o,o,o}', proargnames => '{wal_records,wal_fpi,wal_bytes,wal_buffers_full,stats_reset}', prosrc => 'pg_stat_get_wal' }, +{ oid => '8037', descr => 'statistics: backend WAL activity', + proname => 'pg_stat_get_backend_wal', provolatile => 'v', proparallel => 'r', + prorettype => 'record', proargtypes => 'int4', + proallargtypes => '{int4,int8,int8,numeric,int8,timestamptz}', + proargmodes => '{i,o,o,o,o,o}', + proargnames => '{backend_pid,wal_records,wal_fpi,wal_bytes,wal_buffers_full,stats_reset}', + prosrc => 'pg_stat_get_backend_wal' }, { oid => '6248', descr => 'statistics: information about WAL prefetching', proname => 'pg_stat_get_recovery_prefetch', prorows => '1', proretset => 't', provolatile => 'v', prorettype => 'record', proargtypes => '', diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 4aad10b0b6d..def6b370ac1 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -340,24 +340,6 @@ typedef struct PgStat_IO PgStat_BktypeIO stats[BACKEND_NUM_TYPES]; } PgStat_IO; -typedef struct PgStat_Backend -{ - TimestampTz stat_reset_timestamp; - PgStat_BktypeIO io_stats; -} PgStat_Backend; - -/* --------- - * PgStat_BackendPending Non-flushed backend stats. - * --------- - */ -typedef struct PgStat_BackendPending -{ - /* - * Backend statistics store the same amount of IO data as PGSTAT_KIND_IO. - */ - PgStat_PendingIO pending_io; -} PgStat_BackendPending; - typedef struct PgStat_StatDBEntry { PgStat_Counter xact_commit; @@ -500,6 +482,29 @@ typedef struct PgStat_WalStats TimestampTz stat_reset_timestamp; } PgStat_WalStats; +/* ------- + * PgStat_Backend Backend statistics + * ------- + */ +typedef struct PgStat_Backend +{ + TimestampTz stat_reset_timestamp; + PgStat_BktypeIO io_stats; + PgStat_WalCounters wal_counters; +} PgStat_Backend; + +/* --------- + * PgStat_BackendPending Non-flushed backend stats. + * --------- + */ +typedef struct PgStat_BackendPending +{ + /* + * Backend statistics store the same amount of IO data as PGSTAT_KIND_IO. + */ + PgStat_PendingIO pending_io; +} PgStat_BackendPending; + /* * Functions in pgstat.c */ diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h index 36d228e3558..d5557e6e998 100644 --- a/src/include/utils/pgstat_internal.h +++ b/src/include/utils/pgstat_internal.h @@ -622,7 +622,8 @@ extern void pgstat_archiver_snapshot_cb(void); /* flags for pgstat_flush_backend() */ #define PGSTAT_BACKEND_FLUSH_IO (1 << 0) /* Flush I/O statistics */ -#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO) +#define PGSTAT_BACKEND_FLUSH_WAL (1 << 1) /* Flush WAL statistics */ +#define PGSTAT_BACKEND_FLUSH_ALL (PGSTAT_BACKEND_FLUSH_IO | PGSTAT_BACKEND_FLUSH_WAL) extern bool pgstat_flush_backend(bool nowait, bits32 flags); extern bool pgstat_backend_flush_cb(bool nowait); diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out index 30d763c4aee..f77caacc17d 100644 --- a/src/test/regress/expected/stats.out +++ b/src/test/regress/expected/stats.out @@ -908,8 +908,11 @@ SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELEC -- Test pg_stat_checkpointer checkpointer-related stats, together with pg_stat_wal SELECT num_requested AS rqst_ckpts_before FROM pg_stat_checkpointer \gset --- Test pg_stat_wal (and make a temp table so our temp schema exists) +-- Test pg_stat_wal SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset +-- Test pg_stat_get_backend_wal() +SELECT wal_bytes AS backend_wal_bytes_before from pg_stat_get_backend_wal(pg_backend_pid()) \gset +-- Make a temp table so our temp schema exists CREATE TEMP TABLE test_stats_temp AS SELECT 17; DROP TABLE test_stats_temp; -- Checkpoint twice: The checkpointer reports stats after reporting completion @@ -929,6 +932,18 @@ SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal; t (1 row) +SELECT pg_stat_force_next_flush(); + pg_stat_force_next_flush +-------------------------- + +(1 row) + +SELECT wal_bytes > :backend_wal_bytes_before FROM pg_stat_get_backend_wal(pg_backend_pid()); + ?column? +---------- + t +(1 row) + -- Test pg_stat_get_backend_idset() and some allied functions. -- In particular, verify that their notion of backend ID matches -- our temp schema index. diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql index 5e7ef20fef6..c223800fd19 100644 --- a/src/test/regress/sql/stats.sql +++ b/src/test/regress/sql/stats.sql @@ -426,9 +426,13 @@ SELECT sessions > :db_stat_sessions FROM pg_stat_database WHERE datname = (SELEC -- Test pg_stat_checkpointer checkpointer-related stats, together with pg_stat_wal SELECT num_requested AS rqst_ckpts_before FROM pg_stat_checkpointer \gset --- Test pg_stat_wal (and make a temp table so our temp schema exists) +-- Test pg_stat_wal SELECT wal_bytes AS wal_bytes_before FROM pg_stat_wal \gset +-- Test pg_stat_get_backend_wal() +SELECT wal_bytes AS backend_wal_bytes_before from pg_stat_get_backend_wal(pg_backend_pid()) \gset + +-- Make a temp table so our temp schema exists CREATE TEMP TABLE test_stats_temp AS SELECT 17; DROP TABLE test_stats_temp; @@ -441,6 +445,9 @@ CHECKPOINT; SELECT num_requested > :rqst_ckpts_before FROM pg_stat_checkpointer; SELECT wal_bytes > :wal_bytes_before FROM pg_stat_wal; +SELECT pg_stat_force_next_flush(); +SELECT wal_bytes > :backend_wal_bytes_before FROM pg_stat_get_backend_wal(pg_backend_pid()); + -- Test pg_stat_get_backend_idset() and some allied functions. -- In particular, verify that their notion of backend ID matches -- our temp schema index. -- 2.34.1