On 29 March 2012 21:05, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Barring objections I'll go fix this, and then this patch can be
> considered closed except for possible future tweaking of the
> sticky-entry decay rule.
Attached patch fixes a bug and tweaks sticky-entry decay. The extant code bumps usage (though not call counts) in two hooks (pgss_post_parse_analyze() and pgss_ExecutorEnd()), so prepared queries will always have about half the usage of an equivalent simple query, which is clearly not desirable. With the proposed patch, "usage" should be similar to "calls" until the first call of entry_dealloc(), rather than usually having a value that's about twice as high. With the patch, a run of pgbench with and without "-M prepared" results in a usage of calls + 1 for each query from both runs.

The approach I've taken with decay is to maintain a server-wide median usage value (well, a convenient approximation), which is assigned to sticky entries. This makes it hard to evict those entries in the first couple of calls to entry_dealloc(). On the other hand, if there really is contention for entries, it will soon become really easy to evict sticky entries, because we use a much more aggressive multiplier of 0.5 for their decay. I rather conservatively assume an initial median usage of 10, which is a very low value considering the use of the multiplier trick. In any case, in the real world it won't take too long for entry_dealloc() to be called and set the median value, if in fact it actually matters.

You described entries as precious. This isn't quite the full picture; while pg_stat_statements will malthusianistically burn through pretty much as many entries as you care to give it - or so you might think - I believe that in the real world, the rate at which the module burns through them would frequently look logarithmic. In other words, after an entry_dealloc() call the hashtable is 95% full, but it might take rather a long time to reach 100% again - the first 5% is consumed dramatically faster than the last.
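To make the scheme concrete, here's a minimal self-contained sketch of the rule described above. The names (Entry, dealloc_pass(), the decay factors) are assumptions for illustration, not the actual pg_stat_statements code:

```c
#include <stdlib.h>

#define USAGE_DECREASE_FACTOR  0.99   /* normal decay per dealloc pass */
#define STICKY_DECREASE_FACTOR 0.50   /* much more aggressive decay for sticky entries */
#define USAGE_DEALLOC_PERCENT  5      /* evict the least-used 5% */

typedef struct Entry
{
    long    calls;              /* calls == 0 marks a sticky entry */
    double  usage;
} Entry;

/* server-wide approximation of median usage; conservatively low initial value */
double cur_med_usage = 10.0;

static int
entry_cmp(const void *lhs, const void *rhs)
{
    double l = ((const Entry *) lhs)->usage;
    double r = ((const Entry *) rhs)->usage;

    return (l > r) - (l < r);
}

/*
 * Sketch of an entry_dealloc()-style pass: sort by usage, note the median
 * (so that new sticky entries can be seeded with it), decay every entry,
 * and report how many of the least-used entries would be evicted.
 */
int
dealloc_pass(Entry *entries, int n)
{
    int     i;
    int     nvictims;

    qsort(entries, n, sizeof(Entry), entry_cmp);
    cur_med_usage = entries[n / 2].usage;   /* convenient approximation */

    for (i = 0; i < n; i++)
    {
        if (entries[i].calls == 0)
            entries[i].usage *= STICKY_DECREASE_FACTOR;
        else
            entries[i].usage *= USAGE_DECREASE_FACTOR;
    }

    nvictims = n * USAGE_DEALLOC_PERCENT / 100;
    return (nvictims < 1) ? 1 : nvictims;
}
```

A new sticky entry would then be created with usage = cur_med_usage, so it survives the first couple of dealloc passes, while the 0.5 multiplier ensures it becomes easy to evict quickly if there is real contention for slots.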
The user might not actually care if you need to cache a sticky value for a few hours in one of their slots while they run an epic reporting query, even though the hashtable is over 95% full. The idea is to avoid evicting a sticky entry just because there happened to be an infrequent entry_dealloc() at the wrong time, and the least marginal of the most marginal 5% of non-sticky entries (that is, the 5% up for eviction) happened to have a call count/usage higher than the magic value of 3 - which I find quite plausible.

If I apply your test for dead sticky entries after the regression tests (serial schedule) were run, my approach compares very favourably (granted, usage values were presumably double-counted for your test, making our results less than completely comparable). For the purposes of this experiment, I've just commented out "if (calls == 0) continue;" within the pg_stat_statements() function, obviously:

postgres=# select calls = 0, count(*) from pg_stat_statements() group by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 959
-[ RECORD 2 ]-
?column? | t
count    | 3     <--- this includes the above query itself

postgres=# select calls = 0, count(*) from pg_stat_statements() group by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 960   <--- now it's counted here...
-[ RECORD 2 ]-
?column? | t
count    | 2     <--- ...not here

I've also attached some elogs, in their original chronological order, that trace the median usage as recorded at entry_dealloc() for the regression tests. As you'd expect, given that these are the regression tests, the median is very low, consistently between 1.9 and 2.5. An additional factor that makes this work well is that the standard deviation is low, and as such it is much easier to evict sticky entries, which is what you want here.

-- 
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.940598
DEBUG: cur_med_usage: 1.921192
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.882960
DEBUG: cur_med_usage: 1.882960
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.901980
DEBUG: cur_med_usage: 1.921192
DEBUG: cur_med_usage: 1.921192
DEBUG: cur_med_usage: 1.921192
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.960200
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 2.110343
DEBUG: cur_med_usage: 2.089240
DEBUG: cur_med_usage: 2.068347
DEBUG: cur_med_usage: 2.047664
DEBUG: cur_med_usage: 2.027187
DEBUG: cur_med_usage: 2.006915
DEBUG: cur_med_usage: 1.986846
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 1.980000
DEBUG: cur_med_usage: 2.110343
DEBUG: cur_med_usage: 2.138837
DEBUG: cur_med_usage: 2.117449
DEBUG: cur_med_usage: 2.117449
DEBUG: cur_med_usage: 2.110343
DEBUG: cur_med_usage: 2.089240
DEBUG: cur_med_usage: 2.110343
DEBUG: cur_med_usage: 2.131660
DEBUG: cur_med_usage: 2.188627
DEBUG: cur_med_usage: 2.166740
DEBUG: cur_med_usage: 2.145073
DEBUG: cur_med_usage: 2.123622
DEBUG: cur_med_usage: 2.102386
DEBUG: cur_med_usage: 2.081362
DEBUG: cur_med_usage: 2.153192
DEBUG: cur_med_usage: 2.138837
DEBUG: cur_med_usage: 2.123622
DEBUG: cur_med_usage: 2.102386
DEBUG: cur_med_usage: 2.081362
DEBUG: cur_med_usage: 2.068347
DEBUG: cur_med_usage: 2.110343
DEBUG: cur_med_usage: 2.219101
DEBUG: cur_med_usage: 2.272166
DEBUG: cur_med_usage: 2.307231
DEBUG: cur_med_usage: 2.289912
DEBUG: cur_med_usage: 2.267013
DEBUG: cur_med_usage: 2.283049
DEBUG: cur_med_usage: 2.260218
DEBUG: cur_med_usage: 2.237616
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.306110
DEBUG: cur_med_usage: 2.310129
DEBUG: cur_med_usage: 2.301419
DEBUG: cur_med_usage: 2.341321
DEBUG: cur_med_usage: 2.429184
DEBUG: cur_med_usage: 2.404892
DEBUG: cur_med_usage: 2.412989
DEBUG: cur_med_usage: 2.444469
DEBUG: cur_med_usage: 2.428392
DEBUG: cur_med_usage: 2.404108
DEBUG: cur_med_usage: 2.380067
DEBUG: cur_med_usage: 2.404892
DEBUG: cur_med_usage: 2.404892
DEBUG: cur_med_usage: 2.400707
DEBUG: cur_med_usage: 2.376700
DEBUG: cur_med_usage: 2.357034
DEBUG: cur_med_usage: 2.357034
DEBUG: cur_med_usage: 2.356266
DEBUG: cur_med_usage: 2.333464
DEBUG: cur_med_usage: 2.310129
DEBUG: cur_med_usage: 2.310129
DEBUG: cur_med_usage: 2.310129
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.301419
DEBUG: cur_med_usage: 2.294729
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.287028
DEBUG: cur_med_usage: 2.283049
DEBUG: cur_med_usage: 2.264158
DEBUG: cur_med_usage: 2.248408
DEBUG: cur_med_usage: 2.241516
DEBUG: cur_med_usage: 2.219564
DEBUG: cur_med_usage: 2.255621
DEBUG: cur_med_usage: 2.333464
DEBUG: cur_med_usage: 2.333464
DEBUG: cur_med_usage: 2.333464
DEBUG: cur_med_usage: 2.380843
DEBUG: cur_med_usage: 2.371093
DEBUG: cur_med_usage: 2.395824
DEBUG: cur_med_usage: 2.420024
DEBUG: cur_med_usage: 2.404892
DEBUG: cur_med_usage: 2.380843
DEBUG: cur_med_usage: 2.395824
DEBUG: cur_med_usage: 2.380843
DEBUG: cur_med_usage: 2.371866
DEBUG: cur_med_usage: 2.357034
DEBUG: cur_med_usage: 2.348147
DEBUG: cur_med_usage: 2.348147
DEBUG: cur_med_usage: 2.348147
DEBUG: cur_med_usage: 2.357034
DEBUG: cur_med_usage: 2.348147
DEBUG: cur_med_usage: 2.380843
DEBUG: cur_med_usage: 2.412285
pg_stat_statements_decay_2012_04_06.patch
Description: Binary data