On Tue, Dec 17, 2024 at 9:11 AM Tomas Vondra <to...@vondra.me> wrote:
> I don't follow. How could non-aggressive VACUUM advance relfrozenxid,
> ever? I mean, if it doesn't guarantee freezing all pages, how could it?

Although it's very workload dependent, it still happens all the time.
Just look at the autovacuum log output from almost any autovacuum that
runs when the regression tests run. Or look at the autovacuum output
for the small pgbench tables.

In general, relfrozenxid simply tracks the oldest possible extant XID
in the table. VACUUM doesn't necessarily need to do any freezing to
advance relfrozenxid/relminmxid. But VACUUM *must* exhaustively scan
every heap page that could possibly contain an old XID in order to be
able to advance relfrozenxid/relminmxid safely.

> > That's an interesting idea. And it seems like a much more effective
> > way of getting some relfrozenxid advancement than hoping that the
> > pages you scan due to SKIP_PAGES_THRESHOLD end up being enough to have
> > scanned all unfrozen tuples.

> But I think that (a) is going to be fairly complex, because how do you
> cost the future vacuum?, and (b) is somewhat misses my point that on
> modern NVMe SSD storage (SKIP_PAGES_THRESHOLD > 1) doesn't seem to be a
> win *ever*.

I am not suggesting that the readahead argument for
SKIP_PAGES_THRESHOLD is really valid. I think that the relfrozenxid
argument is the only one that makes any sense. Clearly both arguments
justified the introduction of SKIP_PAGES_THRESHOLD, after the earliest
work on the visibility map back in 2009 -- see the commit message for
bf136cf6.

In short, I am envisaging a design that decides whether or not it'll
advance relfrozenxid based on both the costs and the benefits/need.
Under this scheme, VACUUM would either scan exactly all
all-visible-not-all-frozen pages, or scan none at all. This decision
would be almost completely independent of the decision to freeze or
not freeze pages (it'd be loosely related because FreezeLimit can
never be more than autovacuum_freeze_max_age/2 XIDs in age).

Then we'd be free to just get rid of SKIP_PAGES_THRESHOLD, which
presumably isn't doing much for readahead.

--
Peter Geoghegan
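
A minimal sketch of the all-or-nothing decision described above, for illustration only. It is not PostgreSQL code and not a proposed patch. The `TableStats` type, both function names, and the cost/need formulas are hypothetical; the only value taken from PostgreSQL is the default autovacuum_freeze_max_age of 200 million XIDs.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table inputs to the decision (not a PostgreSQL struct)."""
    all_visible_not_frozen_pages: int  # skippable pages that may still hold old XIDs
    other_pages_to_scan: int           # pages VACUUM has to scan regardless
    relfrozenxid_age: int              # current age of relfrozenxid, in XIDs


def should_advance_relfrozenxid(stats: TableStats,
                                freeze_max_age: int = 200_000_000) -> bool:
    """Decide up front whether this VACUUM will try to advance relfrozenxid.

    Assumed policy: "need" grows linearly with relfrozenxid age, "cost" is the
    share of the total scan made up of extra all-visible pages, and we pay the
    cost only when it does not exceed the need.
    """
    if stats.all_visible_not_frozen_pages == 0:
        return True  # nothing extra to scan, so advancement costs nothing
    need = min(1.0, stats.relfrozenxid_age / freeze_max_age)
    if need >= 1.0:
        return True  # table is old enough that advancement is no longer optional
    total = stats.all_visible_not_frozen_pages + stats.other_pages_to_scan
    cost = stats.all_visible_not_frozen_pages / total
    return cost <= need


def extra_pages_to_scan(stats: TableStats,
                        freeze_max_age: int = 200_000_000) -> int:
    """All all-visible-not-all-frozen pages, or none; never a partial subset."""
    if should_advance_relfrozenxid(stats, freeze_max_age):
        return stats.all_visible_not_frozen_pages
    return 0
```

The point of the sketch is that the decision is made once per VACUUM from table-level inputs, rather than emerging from which ranges of skippable pages happen to fall under SKIP_PAGES_THRESHOLD. A partial scan of all-visible pages would pay some of the cost without being able to advance relfrozenxid at all.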