On Fri, Feb 4, 2022 at 10:21 PM Greg Stark <st...@mit.edu> wrote: > On Wed, 15 Dec 2021 at 15:30, Peter Geoghegan <p...@bowt.ie> wrote: > > My emphasis here has been on making non-aggressive VACUUMs *always* > > advance relfrozenxid, outside of certain obvious edge cases. And so > > with all the patches applied, up to and including the opportunistic > > freezing patch, every autovacuum of every table manages to advance > > relfrozenxid during benchmarking -- usually to a fairly recent value. > > I've focussed on making aggressive VACUUMs (especially anti-wraparound > > autovacuums) a rare occurrence, for truly exceptional cases (e.g., > > user keeps canceling autovacuums, maybe due to automated script that > > performs DDL). That has taken priority over other goals, for now. > > While I've seen all the above cases triggering anti-wraparound cases > by far the majority of the cases are not of these pathological forms.
Right - it's practically inevitable that you'll need an anti-wraparound VACUUM to advance relfrozenxid right now. Technically it's possible to advance relfrozenxid in any VACUUM, but in practice it just never happens on a large table. You only need to get unlucky with one heap page, either by failing to get a cleanup lock, or (more likely) by setting even one single page all-visible but not all-frozen just once (once in any VACUUM that takes place between anti-wraparound VACUUMs). > By far the majority of anti-wraparound vacuums are triggered by tables > that are very large and so don't trigger regular vacuums for "long > periods" of time and consistently hit the anti-wraparound threshold > first. autovacuum_vacuum_insert_scale_factor can help with this on 13 and 14, but only if you tune autovacuum_freeze_min_age with that goal in mind. Which probably doesn't happen very often. > There's nothing limiting how long "long periods" is and nothing tying > it to the rate of xid consumption. It's quite common to have some > *very* large mostly static tables in databases that have other tables > that are *very* busy. > > The worst I've seen is a table that took 36 hours to vacuum in a > database that consumed about a billion transactions per day... That's > extreme but these days it's quite common to see tables that get > anti-wraparound vacuums every week or so despite having < 1% modified > tuples. And databases are only getting bigger and transaction rates > faster... Sounds very much like what I've been calling the freezing cliff. An anti-wraparound VACUUM throws things off by suddenly dirtying many more pages than the expected amount for a VACUUM against the table, despite there being no change in workload characteristics. If you just had to remove the dead tuples in such a table, then it probably wouldn't matter if it happened earlier than expected. -- Peter Geoghegan