On Fri, Feb 4, 2022 at 10:21 PM Greg Stark <st...@mit.edu> wrote:
> On Wed, 15 Dec 2021 at 15:30, Peter Geoghegan <p...@bowt.ie> wrote:
> > My emphasis here has been on making non-aggressive VACUUMs *always*
> > advance relfrozenxid, outside of certain obvious edge cases. And so
> > with all the patches applied, up to and including the opportunistic
> > freezing patch, every autovacuum of every table manages to advance
> > relfrozenxid during benchmarking -- usually to a fairly recent value.
> > I've focussed on making aggressive VACUUMs (especially anti-wraparound
> > autovacuums) a rare occurrence, for truly exceptional cases (e.g.,
> > user keeps canceling autovacuums, maybe due to automated script that
> > performs DDL). That has taken priority over other goals, for now.
>
> While I've seen all the above cases triggering anti-wraparound cases
> by far the majority of the cases are not of these pathological forms.

Right - it's practically inevitable that you'll need an
anti-wraparound VACUUM to advance relfrozenxid right now. Technically
it's possible to advance relfrozenxid in any VACUUM, but in practice
it just never happens on a large table. You only need to get unlucky
with one heap page, either by failing to get a cleanup lock, or (more
likely) by setting even one single page all-visible but not all-frozen
just once (once in any VACUUM that takes place between anti-wraparound
VACUUMs).

> By far the majority of anti-wraparound vacuums are triggered by tables
> that are very large and so don't trigger regular vacuums for "long
> periods" of time and consistently hit the anti-wraparound threshold
> first.

autovacuum_vacuum_insert_scale_factor can help with this on 13 and 14,
but only if you tune autovacuum_freeze_min_age with that goal in mind.
Which probably doesn't happen very often.

> There's nothing limiting how long "long periods" is and nothing tying
> it to the rate of xid consumption. It's quite common to have some
> *very* large mostly static tables in databases that have other tables
> that are *very* busy.
>
> The worst I've seen is a table that took 36 hours to vacuum in a
> database that consumed about a billion transactions per day... That's
> extreme but these days it's quite common to see tables that get
> anti-wraparound vacuums every week or so despite having < 1% modified
> tuples. And databases are only getting bigger and transaction rates
> faster...

Sounds very much like what I've been calling the freezing cliff. An
anti-wraparound VACUUM throws things off by suddenly dirtying many
more pages than the expected amount for a VACUUM against the table,
despite there being no change in workload characteristics. If you just
had to remove the dead tuples in such a table, then it probably
wouldn't matter if it happened earlier than expected.

-- 
Peter Geoghegan


Reply via email to