On 20.02.2025 04:04, Sami Imseih wrote:
In my opinion, sample rate is a better fit for pg_stat_statements,
since the queries that
you care about the most are usually the most frequently executed. Sampling them
will still provide enough good data without the risk of not capturing
statistics about
them at all.
Longer running queries will also likely be the least frequent, so they
are already not likely
contributing to the spinlock contention. Also, the least frequent
queries will likely be aged
out faster, so pg_stat_statements was never really a good candidate to
track those anyways;
slow query logging with log_min_duration_statement is a better way to
ensure you capture
the data.
Maybe others may have a different opinion?
--
Sami
Hi everyone,
Since most people have expressed support for sampling based on frequency
of query, I will revert the rebased patch for version v15. Also, given
that everyone is likely busy with the higher-priority task of IN
normalization in pg_stat_statements, I will leave the rebased v19-patch
here and move it back to Ready for Committer, as the patch has already
been it.
Any suggestions are welcome.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
From 9a27fecdfb0a2e351e8476430d5d5030324a887d Mon Sep 17 00:00:00 2001
From: Ilia Evdokimov <ilya.evdoki...@tantorlabs.com>
Date: Mon, 3 Mar 2025 15:41:48 +0300
Subject: [PATCH v19] Allow setting sample rate for pg_stat_statements
New configuration parameter pg_stat_statements.sample_rate makes it
possible to track just a fraction of the queries meeting the configured
threshold, to reduce the amount of tracking.
---
contrib/pg_stat_statements/Makefile | 2 +-
.../pg_stat_statements/expected/sampling.out | 174 ++++++++++++++++++
contrib/pg_stat_statements/meson.build | 1 +
.../pg_stat_statements/pg_stat_statements.c | 53 +++++-
contrib/pg_stat_statements/sql/sampling.sql | 50 +++++
doc/src/sgml/pgstatstatements.sgml | 19 ++
6 files changed, 291 insertions(+), 8 deletions(-)
create mode 100644 contrib/pg_stat_statements/expected/sampling.out
create mode 100644 contrib/pg_stat_statements/sql/sampling.sql
diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index 241c02587b..b70bdfaf2d 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -20,7 +20,7 @@ LDFLAGS_SL += $(filter -lm, $(LIBS))
REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf
REGRESS = select dml cursors utility level_tracking planning \
user_activity wal entry_timestamp privileges extended \
- parallel cleanup oldextversions
+ parallel sampling cleanup oldextversions
# Disabled because these tests require "shared_preload_libraries=pg_stat_statements",
# which typical installcheck users do not have (e.g. buildfarm clients).
NO_INSTALLCHECK = 1
diff --git a/contrib/pg_stat_statements/expected/sampling.out b/contrib/pg_stat_statements/expected/sampling.out
new file mode 100644
index 0000000000..2204215f64
--- /dev/null
+++ b/contrib/pg_stat_statements/expected/sampling.out
@@ -0,0 +1,174 @@
+--
+-- sample statements
+--
+-- top-level tracking - simple query protocol
+SHOW pg_stat_statements.track;
+ pg_stat_statements.track
+--------------------------
+ top
+(1 row)
+
+SET pg_stat_statements.sample_rate = 0.0;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+SELECT 1 AS "int";
+ int
+-----
+ 1
+(1 row)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+-------+-------
+(0 rows)
+
+SET pg_stat_statements.sample_rate = 1.0;
+SELECT 1 AS "int";
+ int
+-----
+ 1
+(1 row)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+--------------------+-------
+ SELECT $1 AS "int" | 1
+(1 row)
+
+-- top-level tracking - extended query protocol
+SET pg_stat_statements.sample_rate = 0.0;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+SELECT 1 \parse stmt
+\bind_named stmt \g
+ ?column?
+----------
+ 1
+(1 row)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+-------+-------
+(0 rows)
+
+SET pg_stat_statements.sample_rate = 1.0;
+\bind_named stmt \g
+ ?column?
+----------
+ 1
+(1 row)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query | calls
+-----------+-------
+ SELECT $1 | 1
+(1 row)
+
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+DEALLOCATE stmt;
+-- nested tracking - simple query protocol
+SET pg_stat_statements.track = "all";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+SET pg_stat_statements.sample_rate = 1;
+EXPLAIN (COSTS OFF) SELECT 1;
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT 1;
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+SET pg_stat_statements.sample_rate = 0;
+EXPLAIN (COSTS OFF) SELECT 1;
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+EXPLAIN (COSTS OFF) SELECT 1;
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+SELECT toplevel, calls, query FROM pg_stat_statements
+ ORDER BY query COLLATE "C";
+ toplevel | calls | query
+----------+-------+----------------------------------------------------
+ t | 2 | EXPLAIN (COSTS OFF) SELECT $1
+ f | 2 | SELECT $1
+ t | 1 | SELECT pg_stat_statements_reset() IS NOT NULL AS t
+ t | 2 | SET pg_stat_statements.sample_rate = $1
+(4 rows)
+
+-- nested tracking - extended query protocol
+SET pg_stat_statements.track = "all";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+ t
+---
+ t
+(1 row)
+
+SET pg_stat_statements.sample_rate = 1;
+EXPLAIN (COSTS OFF) SELECT 1; \parse stmt
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+\bind_named stmt \g
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+\bind_named stmt \g
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+SET pg_stat_statements.sample_rate = 0;
+\bind_named stmt \g
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+\bind_named stmt \g
+ QUERY PLAN
+------------
+ Result
+(1 row)
+
+SELECT toplevel, calls, query FROM pg_stat_statements
+ ORDER BY query COLLATE "C";
+ toplevel | calls | query
+----------+-------+-----------------------------------------
+ t | 2 | EXPLAIN (COSTS OFF) SELECT $1
+ f | 3 | SELECT $1
+ t | 1 | SET pg_stat_statements.sample_rate = $1
+(3 rows)
+
diff --git a/contrib/pg_stat_statements/meson.build b/contrib/pg_stat_statements/meson.build
index 4446af58c5..351354a6a5 100644
--- a/contrib/pg_stat_statements/meson.build
+++ b/contrib/pg_stat_statements/meson.build
@@ -54,6 +54,7 @@ tests += {
'privileges',
'extended',
'parallel',
+ 'sampling',
'cleanup',
'oldextversions',
],
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index b245d04097..8ec379cae3 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -50,6 +50,7 @@
#include "access/parallel.h"
#include "catalog/pg_authid.h"
#include "common/int.h"
+#include "common/pg_prng.h"
#include "executor/instrument.h"
#include "funcapi.h"
#include "jit/jit.h"
@@ -257,6 +258,9 @@ typedef struct pgssSharedState
/* Current nesting depth of planner/ExecutorRun/ProcessUtility calls */
static int nesting_level = 0;
+/* Is the current top-level query to be sampled? */
+static bool is_query_sampled = false;
+
/* Saved hook values */
static shmem_request_hook_type prev_shmem_request_hook = NULL;
static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
@@ -295,12 +299,14 @@ static bool pgss_track_utility = true; /* whether to track utility commands */
static bool pgss_track_planning = false; /* whether to track planning
* duration */
static bool pgss_save = true; /* whether to save stats across shutdown */
+static double pgss_sample_rate = 1.0; /* fraction of statements to track */
-#define pgss_enabled(level) \
+#define pgss_enabled(level, skip_sampling_check) \
(!IsParallelWorker() && \
(pgss_track == PGSS_TRACK_ALL || \
- (pgss_track == PGSS_TRACK_TOP && (level) == 0)))
+ (pgss_track == PGSS_TRACK_TOP && (level) == 0)) && \
+ (skip_sampling_check == PGSS_INVALID || current_query_sampled()))
#define record_gc_qtexts() \
do { \
@@ -374,6 +380,7 @@ static char *generate_normalized_query(JumbleState *jstate, const char *query,
static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
int query_loc);
static int comp_location(const void *a, const void *b);
+static bool current_query_sampled(void);
/*
@@ -415,6 +422,19 @@ _PG_init(void)
NULL,
NULL);
+ DefineCustomRealVariable("pg_stat_statements.sample_rate",
+ "Fraction of queries to track.",
+ NULL,
+ &pgss_sample_rate,
+ 1.0,
+ 0.0,
+ 1.0,
+ PGC_SUSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
DefineCustomEnumVariable("pg_stat_statements.track",
"Selects which statements are tracked by pg_stat_statements.",
NULL,
@@ -837,7 +857,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
prev_post_parse_analyze_hook(pstate, query, jstate);
/* Safety check... */
- if (!pgss || !pgss_hash || !pgss_enabled(nesting_level))
+ if (!pgss || !pgss_hash || !pgss_enabled(nesting_level, PGSS_INVALID))
return;
/*
@@ -895,7 +915,7 @@ pgss_planner(Query *parse,
* pgss_store needs it. We also ignore query without queryid, as it would
* be treated as a utility statement, which may not be the case.
*/
- if (pgss_enabled(nesting_level)
+ if (pgss_enabled(nesting_level, PGSS_PLAN)
&& pgss_track_planning && query_string
&& parse->queryId != UINT64CONST(0))
{
@@ -1006,7 +1026,8 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
* counting of optimizable statements that are directly contained in
* utility statements.
*/
- if (pgss_enabled(nesting_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0))
+ if (pgss_enabled(nesting_level, PGSS_EXEC) &&
+ queryDesc->plannedstmt->queryId != UINT64CONST(0))
{
/*
* Set up to track total elapsed time in ExecutorRun. Make sure the
@@ -1077,7 +1098,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
uint64 queryId = queryDesc->plannedstmt->queryId;
if (queryId != UINT64CONST(0) && queryDesc->totaltime &&
- pgss_enabled(nesting_level))
+ pgss_enabled(nesting_level, PGSS_EXEC))
{
/*
* Make sure stats accumulation is done. (Note: it's okay if several
@@ -1120,7 +1141,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
uint64 saved_queryId = pstmt->queryId;
int saved_stmt_location = pstmt->stmt_location;
int saved_stmt_len = pstmt->stmt_len;
- bool enabled = pgss_track_utility && pgss_enabled(nesting_level);
+ bool enabled = pgss_track_utility && pgss_enabled(nesting_level, PGSS_EXEC);
/*
* Force utility statements to get queryId zero. We do this even in cases
@@ -3025,3 +3046,21 @@ comp_location(const void *a, const void *b)
return pg_cmp_s32(l, r);
}
+
+/*
+ * Determine whether the current query should be sampled.
+ *
+ * At the beginning of each top-level statement, decide whether we'll
+ * sample this statement. If nested-statement tracking is enabled,
+ * either all nested statements will be tracked or none will.
+ */
+static bool
+current_query_sampled(void)
+{
+ if (nesting_level == 0)
+ is_query_sampled = pgss_sample_rate != 0.0 &&
+ (pgss_sample_rate == 1.0 ||
+ pg_prng_double(&pg_global_prng_state) <= pgss_sample_rate);
+
+ return is_query_sampled;
+}
\ No newline at end of file
diff --git a/contrib/pg_stat_statements/sql/sampling.sql b/contrib/pg_stat_statements/sql/sampling.sql
new file mode 100644
index 0000000000..b09f45991b
--- /dev/null
+++ b/contrib/pg_stat_statements/sql/sampling.sql
@@ -0,0 +1,50 @@
+--
+-- sample statements
+--
+
+-- top-level tracking - simple query protocol
+SHOW pg_stat_statements.track;
+SET pg_stat_statements.sample_rate = 0.0;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+SELECT 1 AS "int";
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.sample_rate = 1.0;
+SELECT 1 AS "int";
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+
+-- top-level tracking - extended query protocol
+SET pg_stat_statements.sample_rate = 0.0;
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+SELECT 1 \parse stmt
+\bind_named stmt \g
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+SET pg_stat_statements.sample_rate = 1.0;
+\bind_named stmt \g
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+DEALLOCATE stmt;
+
+-- nested tracking - simple query protocol
+SET pg_stat_statements.track = "all";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+SET pg_stat_statements.sample_rate = 1;
+EXPLAIN (COSTS OFF) SELECT 1;
+EXPLAIN (COSTS OFF) SELECT 1;
+SET pg_stat_statements.sample_rate = 0;
+EXPLAIN (COSTS OFF) SELECT 1;
+EXPLAIN (COSTS OFF) SELECT 1;
+SELECT toplevel, calls, query FROM pg_stat_statements
+ ORDER BY query COLLATE "C";
+
+-- nested tracking - extended query protocol
+SET pg_stat_statements.track = "all";
+SELECT pg_stat_statements_reset() IS NOT NULL AS t;
+SET pg_stat_statements.sample_rate = 1;
+EXPLAIN (COSTS OFF) SELECT 1; \parse stmt
+\bind_named stmt \g
+\bind_named stmt \g
+SET pg_stat_statements.sample_rate = 0;
+\bind_named stmt \g
+\bind_named stmt \g
+SELECT toplevel, calls, query FROM pg_stat_statements
+ ORDER BY query COLLATE "C";
diff --git a/doc/src/sgml/pgstatstatements.sgml b/doc/src/sgml/pgstatstatements.sgml
index e2ac1c2d50..f517dd855b 100644
--- a/doc/src/sgml/pgstatstatements.sgml
+++ b/doc/src/sgml/pgstatstatements.sgml
@@ -945,6 +945,25 @@
</listitem>
</varlistentry>
+ <varlistentry>
+ <term>
+ <varname>pg_stat_statements.sample_rate</varname> (<type>real</type>)
+ <indexterm>
+ <primary><varname>pg_stat_statements.sample_rate</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+
+ <listitem>
+ <para>
+ <varname>pg_stat_statements.sample_rate</varname> causes pg_stat_statements to only
+ track a fraction of the statements in each session. The default is <literal>1</literal>,
+ meaning track all the queries. Setting this to <literal>0</literal> disables sampled statements
+ tracking, the same as setting <varname>pg_stat_statements.track</varname> to <literal>none</literal>.
+ In case of nested statements, either all will be tracked or none. Only superusers can change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term>
<varname>pg_stat_statements.save</varname> (<type>boolean</type>)
--
2.34.1