On 02/22/2013 06:27 PM, Jeff Janes wrote:
On Fri, Feb 22, 2013 at 5:57 AM, Stefan Andreatta
<s.andrea...@synedra.com <mailto:s.andrea...@synedra.com>> wrote:
Hi,
If I understand
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html
correctly, the autovacuum threshold in could be estimated like
this in PostgreSQL 9.1:
SELECT pg_stat_user_tables.relname,
pg_stat_user_tables.n_dead_tup,
CAST(current_setting('autovacuum_vacuum_threshold') AS bigint)
+ (CAST(current_setting('autovacuum_vacuum_scale_factor')
AS numeric)
* pg_class.reltuples) AS av_threshold
FROM pg_stat_user_tables
JOIN pg_class on pg_stat_user_tables.relid = pg_class.oid
ORDER BY 1;
If pg_stat_user_tables.n_dead_tup exceeds av_threshold autovacuum
should kick in. Obviously, that does rely on up-to-date
statistics. Is that how it is actually done?
Pretty much, yes. With the caveat that table storage settings can
override the global settings.
2nd question: because pg_stat_user_tables.n_dead_tup is itself
estimated by ANALYZE it cannot be used as a criterion for the next
autoanalyze run, I think.
n_dead_tup is updated by the stats collector, not by ANALYZE.
Thanks Jeff, that helped a lot (as did a careful rereading of
http://www.postgresql.org/docs/9.1/static/monitoring-stats.html and
http://www.postgresql.org/docs/9.1/static/catalog-pg-class.html ;-)
However, to estimate whether autoanalyze should be triggered, I am still
missing something: the analyze threshold is compared to the "total
number of tuples inserted, updated, or deleted since the last ANALYZE."
(according to
http://www.postgresql.org/docs/9.1/static/routine-vacuuming.html).
pg_stat_user_tables.n_live tup - pg_class.reltuples should give
something like the sum of rows inserted minus rows deleted since the
last ANALYZE. But according to the documentation we would need the sum
of those values. And we are still missing a number for rows updated
since the last analyze. pg_stat_usert_tables. n_dead_tup, on the other
hand, is only set back by successful VACUUM. autoanalyzing a table with
more than 10% dead rows would therefore keep autoanalyze in a loop until
the ratio rises beyond 20% (default configuration) and autovacuum kicks
in. So that wouldn't make a lot of sense.
Regards,
Stefan