minimizing pg_stat_statements performance overhead

2019-03-26 Thread Raymond Martin
Hello hackers, 
This email is regarding the Postgres pg_stat_statements extension. 
I noticed that enabling pg_stat_statements can effect performance. I thought 
that changing the pg_stat_statements.track parameter to 'none' could reduce 
this overhead without requiring a restart to remove it from 
shared_preload_libraries. Changing this config did not improve performance as I 
expected. Looking over the code, I noticed that pg_stat_statements is not 
checking if it is enabled before executing the post_parse_analyze_hook 
function. Other hooks that require access to the pg_stat_statements query hash 
table (through the pgss_store function) check for pgss_enabled. 
Would it make sense to check for pgss_enabled in the post_parse_analyze_hook 
function?
 
**Patching**
Making this change drastically improved performance while 
pg_stat_statement.track was set to NONE. This change allows me to more 
effectively enable/disable pg_stat_statements without requiring a restart. 
Example patch:
@@ -783,8 +783,8 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query)
    /* Assert we didn't do this already */
    Assert(query->queryId == 0);
 
-   /* Safety check... */
-   if (!pgss || !pgss_hash)
+   /* Safety check...and ensure that pgss is enabled before we do any work 
*/
+   if (!pgss || !pgss_hash || !pgss_enabled())
    return;

**Simple Mini Benchmark**
I ran a simple test on my local machine with this spec: 16 core/32 GB 
memory/Windows Server 2016.
The simple query I used was 'select 1'. I called pg_stat_statements_reset() 
before each simple query to clear the pg_stat_statements query hash. The 
majority of the latency happens the first time a query is run. 
Median runtime of 100 simple queries in milliseconds: 
PGSS loaded (ms)PGSS loaded + this patch (ms)
track = top  0.53   0.55
track = none   0.41 0.20

PGSS not loaded: 0.18ms

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL





RE: minimizing pg_stat_statements performance overhead

2019-03-27 Thread Raymond Martin
Hi Pascal,
Thanks for your feedback! I like your ideas. 

>the part that hurts in terms or performances is:
>
>   if (jstate.clocations_count > 0)
>   pgss_store(pstate->p_sourcetext,

I agree that this is the typically the most expensive part, but query 
normalization and hashing can also start becoming expensive with larger 
queries.  

>that writes the query text to disk, when it has at less one parameter ...
>Computing the QueryId should stay very small and can be very usefull when used 
>in conjonction with
>https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.postgresql-archive.org%2FFeature-improvement-can-we-add-queryId-for-pg-catalog-pg-stat-activity-view-td6077275.html%23a6077602&data=02%7C01%7Cramarti%40microsoft.com%7Cfaa866abf1d5478e9a2208d6b2887cc4%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636892696615564063&sdata=pWbyVleHceAHoNTMzb5oHGzois5yDaMpEHKmappTIwk%3D&reserved=0
>for wait events sampling.

I also agree that the query id can be very useful! In cases where query id is 
required, pg_stat_statements can be enabled. 
My intent of turning tracking off is to minimize the performance impact of pgss 
as much as possible and the thread above states: "PGSS jumble query logic is 
not bullet proof and may take time then impact the perf".

I believe your fix is a great step forward, but checking enabled at the very 
beginning would lead to better performance. This also follows the paradigm set 
by the rest of the pg_stat_statements codebase.
In the future, if we want only the query ID to be calculated maybe we can add 
another option for that?

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL






RE: minimizing pg_stat_statements performance overhead

2019-03-27 Thread Raymond Martin
Hi Fabien, 
Thank you for your time. Apologies for not being more specific about my testing 
methodology.

> > PGSS not loaded: 0.18ms
>
> This means 0.0018 ms latency per transaction, which seems rather fast, on my 
> laptop I have typically 0.0XX ms...

This actually means 0.18 milliseconds. I agree that this is a bit high, so I 
instead created an Ubuntu VM to get results that would align with yours. 

> I could not reproduce these results on my ubuntu laptop. Could you be more 
> precise about the test? Did you use pgbench? Did it run in parallel? What 
> options were used? What is the test script?

I did not use pgbench. It is important to call pg_stat_statements_reset before 
every query. This simulates a user that is performing distinct and non-repeated 
queries on their database. If you use prepared statements or the same set of 
queries each time, you would remove the contention on the pgss query text file. 
I re-tested this on an Ubuntu machine with 4cores and 14GB ram. I did not run 
it in parallel. I used a python script that implements the follow logic: 
- select pg_stat_statements_reset() -- this is important because we are 
making pgss treat the 'select 1' like a new query which it has not cached into 
pgss_hash. 
- time 'select 1'
Repeat 100 times for each configuration. 

Here are my Ubuntu results:
  pgss unloaded
  Mean: 0.076
  Standard Deviation: 0.038

  pgss.track=none
  Mean: 0.099
  Standard Deviation: 0.040
 
  pgss.track=top 
  Mean: 0.098
  Standard Deviation: 0.107

  pgss.track=none + patch
  Mean: 0.078
  Standard Deviation: 0.042

The results are less noticeable, but I still see about a 20% performance 
improvement here.

> There I have an impact of 10% in these ideal testing conditions wrt latency 
> where the DB does basically nothing, thus which would not warrant to disable 
> pg_stat_statements given the great service this extension brings to 
> performance analysis.

I agree that pg_stat_statements should not be disabled based on these 
performance results. 

> Note that this does not mean that the patch should not be applied, it looks 
> like an oversight, but really I do not have the performance degradation you 
> are suggesting.

I appreciate your input and I want to come up with a canonical test that makes 
this contention more obvious. 
Unfortunately, it is difficult because the criteria that causes this slow down 
(large query sizes and distinct non-repeated queries) are difficult to 
reproduce with pgbench. I would be open to any suggestions here. 

So even though the performance gains in this specific scenario are not as 
great, do you still think it would make sense to submit a patch like this? 

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL




RE: minimizing pg_stat_statements performance overhead

2019-04-01 Thread Raymond Martin
Hi Fabien, 

> Sure, it definitely makes sense to reduce the overhead when the extension is 
> disabled. I wanted to understand the source of performance issue, and your 
> explanations where not enough for reproducing it.
Thanks again Fabien. I am attaching the patch to this email in the hope of 
getting it approved during the next commit fest. 
I will continue trying to find a simple performance test to exemplify the 
performance degradation that I have seen with more complex workloads. 

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL



check_pgss_enabled.patch
Description: check_pgss_enabled.patch


RE: minimizing pg_stat_statements performance overhead

2019-04-03 Thread Raymond Martin
Hi Christoph, 

> you sent the patch as UTF-16, could you re-send it as plain ascii?

Apologies. I re-attached the plain ascii version. 

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL



check_pgss_enabled.patch
Description: check_pgss_enabled.patch


RE: minimizing pg_stat_statements performance overhead

2019-04-03 Thread Raymond Martin
From: Robert Haas
>> 
>> One thing that needs some thought here is what happens if the value of
>> pgss_enabled() changes.  For example we don't want a situation where 
>> if the value changes from off to on between one stage of processing 
>> and another, the server crashes.
>> 
>> I don't know whether that's a risk here or not; it's just something to 
>> think about.
This is definitely an important consideration for this change. A hook could 
have the implicit assumption that a query ID is always generated.  

From: PAscal
> Hi, here is a simple test where I commented that line in 
> pgss_post_parse_analyze
>  to force return; (as if pgss_enabled() was disabled) but kept pgss_enabled() 
> every where else 
>
>   /* Safety check... */
> //if (!pgss || !pgss_hash || !pgss_enabled())
>   return;
>
> This works without crash as you can see here after:

In theory, the rest of the hooks look solid.
As mentioned above, I think the major concern would be a hook that depends 
on a variable generated in pgss_post_parse_analyze. Two hooks 
(pgss_ExecutorStart, pgss_ExecutorEnd) depend on the query ID 
generated from pgss_post_parse_analyze. Fortunately, both of these 
functions already check for query ID before doing work.

I really appreciate you putting this change into practice. 
Great to see your results align with mine. 
Thanks Pascal!!!

--
Raymond Martin
rama...@microsoft.com
Azure Database for PostgreSQL




Re: minimizing pg_stat_statements performance overhead

2019-04-15 Thread Raymond Martin
Hi, 
Apologies, but I had already created a commit here: 
https://commitfest.postgresql.org/23/2080/ . 
Any preference on which to keep?

Thanks, 
Raymond Martin 
rama...@microsoft.com