Re: Sample rate added to pg_stat_statements

Ilia Evdokimov Tue, 07 Jan 2025 14:08:15 -0800


On 07.01.2025 22:29, Sami Imseih wrote:

You are right. This is absolutely unexpected for users. Thank you for
the review.

To fix this, I suggest storing a random number in the [0, 1) range in a
separate variable, which will be used for comparisons in any place. We
will compare 'sample_rate' and random value not only in
pgss_post_parse_analyze(), but also in pgss_ProcessUtility() and
pgss_planner().

I attached patch with test and fixes.

I still think this is not the correct approach. It seems that post_parse_analyze
should not have to deal with checking for a sample rate. This is because
post_parse_analyze, which is the only hook with access to jstate, is
only responsible
for storing a normalized query text on disk and creating a not-yet
user visible entry
in the hash. i.e. pgss_store will never increment counters when called from
pgss_post_parse_analyze.

/* Increment the counts, except when jstate is not NULL */
if (!jstate)
{

What I think may be a better idea is to add something similar
to auto_explain.c, but it should only be added to the top of
pgss_planner, pgss_ExecutorStart and pgss_ProcessUtility.

if (nesting_level == 0)
{
if (!IsParallelWorker())
current_query_sampled = pg_prng_double(&pg_global_prng_state) <
pgss_sample_rate;
else
current_query_sampled = false;
}

This is needed for ExecutorStart and not ExecutorEnd because
ExecutorStart is where the instrumentation is initialized with
queryDesc->totaltime. The above code block could be
turned into a macro and reused in the routines mentioned.

However, it seems with this change, we can see a much
higher likelihood of non-normalized query texts being stored.
This is because jstate is only available during post_parse_analyze.
Therefore, if the first time you are sampling the statement is in ExecutorEnd,
then you will end up storing a non-normalized version of the query text,
see below example with the attached v8.

postgres=# set pg_stat_statements.sample_rate = 0;
SET
postgres=# select pg_stat_statements_reset();
    pg_stat_statements_reset
-------------------------------
  2025-01-07 13:02:11.807952-06
(1 row)

postgres=# SELECT 1 \parse stmt

postgres=# \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
  query | calls
-------+-------
(0 rows)

postgres=# set pg_stat_statements.sample_rate = 1;
SET
postgres=# \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
  query | calls
-------+-------
(0 rows)

postgres=#  \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
                     query                    | calls
---------------------------------------------+-------
  SELECT 1                                    |     1
  SELECT query, calls FROM pg_stat_statements |     1
(2 rows)

postgres=#  \bind_named stmt \g
  ?column?
----------
         1
(1 row)

postgres=# SELECT query, calls FROM pg_stat_statements;
                     query                    | calls
---------------------------------------------+-------
  SELECT 1                                    |     2
  SELECT query, calls FROM pg_stat_statements |     2
(2 rows)

One idea is to make jstate available to all hooks, and completely
remove reliance on post_parse_analyze in pg_stat_statements.
I can't find the thread now, but I know this has come up in past discussions
when troubleshooting gaps in query normalization. My concern is this
feature will greatly increase the likelihood of non-normalized query texts.

Also, with regards to the benchmarks, it seems
sampling will be beneficial for workloads that are touching a small number
of entries with high concurrency. This is why we see benefit for
a standard pgbench workload.
Samping becomes less beneficial when there is a large set of queries
being updated.
I still think this is a good approach for workloads that touch a small set
of entries.

Regards,

Sami

Wow, thank you for pointing this out. Your solution looks interesting,but I'd like to explore other ways to solve the issue besides making itavailable to all hooks. If I don't find anything better, I'll go with yours.


--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Re: Sample rate added to pg_stat_statements

Reply via email to