Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Robert Haas Tue, 07 Dec 2021 13:59:36 -0800

On Tue, Dec 7, 2021 at 3:44 PM Peter Geoghegan <p...@bowt.ie> wrote:
> Fair enough, but even then we still ultimately have to generate a
> final number that represents how close we are to a configurable "do an
> autovacuum" threshold (such as an autovacuum_vacuum_scale_factor-based
> threshold) -- the autovacuum.c side of this (the consumer side)
> fundamentally needs the model to reduce everything to a one
> dimensional number (even though the reality is that there isn't just
> one dimension). This single number (abstract bloat units, abstract
> dead tuples, whatever) is a function of things like the count of dead
> HOT chains, perhaps the concentration of dead tuples on heap pages,
> whatever -- but it's not the same thing as any one of those things we
> count.
>
> I think that this final number needs to be denominated in abstract
> units -- we need to call those abstract units *something*. I don't
> care what that name ends up being, as long as it reflects reality.


If we're only trying to decide whether or not to vacuum a table, we
don't need units: the output is a Boolean. If we're trying to decide
on an order in which to vacuum tables, then we need units. But such
units can't be anything related to dead tuples, because vacuum can be
needed based on XID age, or MXID age, or dead tuples. The units would
have to be something like abstract vacuum-urgency units (if higher is
more urgent) or abstract remaining-headroom-beform-catastrophe units
(if lower is more urgent).

Ignoring wraparound considerations for a moment, I think that we might
want to consider having multiple thresholds and vacuuming the table if
any one of them are met. For example, suppose a table qualifies for
vacuum when %-of-not-all-visible-pages > some-threshold, or
alternatively when %-of-index-tuples-thought-to-be-dead >
some-other-threshold.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Reply via email to