Hi all, I've started this new thread separated from the thread[1] to discuss removing vacuum_cleanup_index_scale_factor GUC parameter proposed by Peter Geoghegan.
btvacuumcleanup() has been playing two roles: recycling deleted pages and collecting index statistics. This discussion focuses on the latter. Since PG11, btvacuumcleanup() uses vacuum_cleanup_index_scale_factor as a threshold to do an index scan to update index statistics (pg_class.reltuples and pg_class.relpages). Therefore, even if there is no update on the btree index at all (e.g., btbulkdelete() was not called earlier), btvacuumcleanup() scans the whole index to collect the index statistics if the number of newly inserted tuples exceeds the vacuum_cleanup_index_scale_factor fraction of the total number of heap tuples detected by the previous statistics collection. On the other hand, those index statistics are updated also by ANALYZE and autoanalyze. pg_class.reltuples calculated by ANALYZE is an estimation whereas the value returned by btvacuumcleanup() is an accurate value. But perhaps we can rely on ANALYZE and autoanalyze to update those index statistics. The points of this discussion are what we really need to do in btvacuumcleanup() and whether btvacuumcleanup() really needs to do an index scan for the purpose of index statistics update. The original design that made VACUUM set pg_class.reltuples/pg_class.relpages in indexes (from 15+ years ago) assumed that it was cheap to handle statistics in passing. Even if we have btvacuumcleanup() not do an index scan at all, this is 100% allowed by the amvacuumcleanup contract described in the documentation: "It is OK to return NULL if the index was not changed at all during the VACUUM operation, but otherwise correct stats should be returned." The above description was added by commit e57345975cf in 2006 and hasn't changed for now. For instance, looking at hash indexes, it hasn't really changed since 2006 in terms of amvacuumcleanup(). hashvacuumcleanup() simply sets stats->num_pages and stats->num_index_tuples without an index scan. I'd like to quote the in-depth analysis by Peter Geoghegan: ----- /* * Post-VACUUM cleanup. * * Result: a palloc'd struct containing statistical info for VACUUM displays. */ IndexBulkDeleteResult * hashvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats) { Relation rel = info->index; BlockNumber num_pages; /* If hashbulkdelete wasn't called, return NULL signifying no change */ /* Note: this covers the analyze_only case too */ if (stats == NULL) return NULL; /* update statistics */ num_pages = RelationGetNumberOfBlocks(rel); stats->num_pages = num_pages; return stats; } Clearly hashvacuumcleanup() was considered by Tom when he revised the documentation in 2006. Here are some observations about hashvacuumcleanup() that seem relevant now: * There is no "analyze_only" handling, just like nbtree. "analyze_only" is only used by GIN, even now, 15+ years after it was added. GIN uses it to make autovacuum workers (never VACUUM outside of an AV worker) do pending list insertions for ANALYZE -- just to make it happen more often. This is a niche thing -- clearly we don't have to care about it in nbtree, even if we make btvacuumcleanup() (almost) always return NULL when there was no btbulkdelete() call. * num_pages (which will become pg_class.relpages for the index) is not set when we return NULL -- hashvacuumcleanup() assumes that ANALYZE will get to it eventually in the case where VACUUM does no real work (when it just returns NULL). * We also use RelationGetNumberOfBlocks() to set pg_class.relpages for index relations during ANALYZE -- it's called when we call vac_update_relstats() (I quoted this do_analyze_rel() code to you directly in a recent email). * In general, pg_class.relpages isn't an estimate (because we use RelationGetNumberOfBlocks(), both in the VACUUM-updates case and the ANALYZE-updates case) -- only pg_class.reltuples is truly an estimate during ANALYZE, and so getting a "true count" seems to have only limited practical importance. I think that this sets a precedent in support of my view that we can simply get rid of vacuum_cleanup_index_scale_factor without any special effort to maintain pg_class.reltuples. As I said before, we can safely make btvacuumcleanup() just like hashvacuumcleanup(), except when there are known deleted-but-not-recycled pages, where a full index scan really is necessary for reasons that are not related to statistics at all (of course we still need the *logic* that was added to nbtree by the vacuum_cleanup_index_scale_factor commit -- that is clearly necessary). My guess is that Tom would have made btvacuumcleanup() look identical to hashvacuumcleanup() in 2006 if nbtree didn't have page deletion to consider -- but that had to be considered. ----- The above discussions make sense to me as a support for the "removing vacuum_cleanup_index_scale_factor GUC" proposal. The difference between the index statistics taken by ANALYZE and btvacuumcleanup() is that the former statistics is always an estimation. That’s calculated by compute_index_stats() whereas the latter uses the result of an index scan. If btvacuumcleanup() doesn’t scan the index and always returns NULL, it would become hard to collect accurate index statistics, for example in a static table case. But if collecting an accurate pg_class.reltuples is not important in practice, I agree that we don't need btvacuumcleanup() to do an index scan for collecting statistics purposes. What do you think about removing vacuum_cleanup_index_scale_factor parameter? and we'd like to ask the design principles of amvacuumcleanup() considered in 2006 by various hackers (mostly Tom). What do you think, Tom? Regards, [1] https://www.postgresql.org/message-id/CAH2-WznUWHOL%2BnYYT2PLsn%2B3OWcq8OBfA1sB3FX885rE%3DZQVvA%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/