> But instead of blindly reducing the frequency via PRNG, we can take a more
> thoughtful approach with a threshold based on execution time:
> 1. Find the most frequent query by the 'calls' column in pg_stat_statements;
> 2. For that query, look at its execution-time statistics: min_exec_time,
>    max_exec_time, etc.;
> 3. Gradually increase the threshold from min_exec_time toward max_exec_time,
>    limiting the tracking of this query;
> 4. Monitor performance: once the bottleneck is resolved, stop at the current
>    threshold value.
>
> This approach allows us to:
> - Eliminate the spin-lock bottleneck;
> - Preserve data about slow queries, which may be critical for performance
>   analysis;
> - Reduce the load on the most frequent queries causing contention, instead
>   of uniformly reducing the frequency for all queries.

In my opinion, a sample rate is a better fit for pg_stat_statements, since the
queries you care about most are usually the most frequently executed. Sampling
them will still provide enough good data, without the risk of capturing no
statistics about them at all.

Longer-running queries are also likely to be the least frequent, so they are
probably not contributing much to the spinlock contention in the first place.
The least frequent queries are also the most likely to be aged out quickly, so
pg_stat_statements was never really a good candidate for tracking those anyway;
slow query logging with log_min_duration_statement is a better way to ensure
you capture that data.

Maybe others have a different opinion?

-- Sami
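To illustrate why sampling the hottest queries still leaves plenty of signal,
here is a quick back-of-the-envelope simulation. This is only an illustrative
Python sketch, not the actual C implementation or any proposed patch;
`should_track` and the 10% rate are hypothetical:

```python
import random

def should_track(sample_rate: float) -> bool:
    """PRNG-based sampling decision: track this execution with
    probability sample_rate, so stats-lock acquisitions drop by
    roughly (1 - sample_rate)."""
    return random.random() < sample_rate

# A hot query executed 100k times with a 10% sample rate still yields
# on the order of 10k recorded data points -- ample for estimating
# mean/min/max execution time -- while the spinlock is taken ~90%
# less often for that query.
random.seed(42)
tracked = sum(should_track(0.1) for _ in range(100_000))
```

The point is that for the most frequent queries, the ones actually causing the
contention, sampling barely costs you any statistical accuracy.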