On Wed, Apr 21, 2021 at 5:39 AM Magnus Hagander <mag...@hagander.net> wrote:
> I'm pretty sure everybody would *want* this. At least nobody would be
> against it. The problem is the potential performance cost of it.

VACUUM remembers vacrel->new_live_tuples as the pg_class.reltuples for
the heap relation being vacuumed. It also remembers new_rel_pages in
pg_class (see vac_update_relstats()). However, it does not remember
vacrel->new_dead_tuples in pg_class or in any other durable location
(the information gets remembered via a call to pgstat_report_vacuum()
instead).

We already *almost* pay the full cost of durably storing the
information used by autovacuum.c's relation_needs_vacanalyze() to
determine if a VACUUM is required -- we're only missing
new_dead_tuples/tabentry->n_dead_tuples. Why not go one tiny baby step
further to fix this issue?
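
To make the gap concrete, here is a minimal sketch of the dead-tuple test that relation_needs_vacanalyze() applies. It is a simplified illustration, not the actual autovacuum.c code: the VacuumInputs struct and the needs_vacuum() helper are hypothetical names, and the forced (wraparound) and insert-driven checks are left out. The 50 and 0.2 used in the note below are the default values of autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the two inputs discussed above. This is not
 * one of PostgreSQL's real structs.
 */
typedef struct VacuumInputs
{
    double reltuples;      /* stored durably in pg_class.reltuples */
    double n_dead_tuples;  /* reported via pgstat_report_vacuum(), not durable */
} VacuumInputs;

/*
 * Simplified form of the dead-tuple check: vacuum is needed once the
 * number of dead tuples exceeds base_thresh + scale_factor * reltuples.
 */
bool
needs_vacuum(const VacuumInputs *in, double base_thresh, double scale_factor)
{
    double vacthresh = base_thresh + scale_factor * in->reltuples;

    return in->n_dead_tuples > vacthresh;
}
```

With reltuples = 1000 and the defaults, the threshold is 50 + 0.2 * 1000 = 250 dead tuples. A table with 300 dead tuples qualifies. If the dead-tuple counter is lost and restarts from zero, which is what I assume happens after a hard crash, the same table no longer qualifies even though reltuples survived.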

Admittedly, storing new_dead_tuples durably is not sufficient to allow
ANALYZE to be launched on schedule when there is a hard crash. It is
also insufficient to make sure that insert-driven autovacuums get
launched on schedule. Even so, I'm pretty sure that just making
sure that we store it durably (alongside pg_class.reltuples?) will
impose only a modest additional cost, while fixing Patrik's problem.
That seems likely to be worth it.

-- 
Peter Geoghegan

