On Thu, Mar 11, 2021 at 8:31 AM Robert Haas <robertmh...@gmail.com> wrote: > But even if not, I'm not sure this > helps much with the situation you're concerned about, which involves > non-HOT tuples.
Attached is a POC-quality revision of Masahiko's skip_index_vacuum.patch [1]. There is an improved design for skipping index vacuuming (improved over the INDEX_CLEANUP stuff from Postgres 12). I'm particularly interested in your perspective on this refactoring stuff, Robert, because you ran into the same issues after initial commit of the INDEX_CLEANUP reloption feature [2] -- you ran into issues with the "tupgone = true" special case. This is the case where VACUUM considers a tuple dead that was not marked LP_DEAD by pruning, and so needs to be killed in the second heap scan in lazy_vacuum_heap() instead. You'll recall that these issues were fixed by your commit dd695979888 from May 2019. I think that we need to go further than you did in dd695979888 for this -- we ought to get rid of the special case entirely. ISTM that any new code that skips index vacuuming really ought to be structured as a dynamic version of the "VACUUM (INDEX_CLEANUP OFF)" mechanism. Or vice versa. The important thing is to recognize that they're essentially the same thing, and to structure the code such that they become exactly the same mechanism internally. That's not trivial right now. But removing the awful "tupgone = true" special case seems to buy us a lot -- it makes unifying everything relatively straightforward. In particular, it makes it possible to delay the decision to vacuum indexes until the last moment, which seems essential to making index vacuuming optional. And so I have removed the tupgone/XLOG_HEAP2_CLEANUP_INFO crud in the patch -- that's what all of the changes relate to. This results in a net negative line count, which is a nice bonus! I've CC'd Noah, because my additions to this revision (of Masahiko's patch) are loosely based on an abandoned 2013 patch from Noah [3] -- Noah didn't care for the "tupgone = true" special case either. I think that it's fair to say that Tom doesn't much care for it either [4], or at least was distressed by its lack of test coverage as of a couple of years ago -- which is a problem that still exists today. Honestly, I'm surprised that somebody else hasn't removed the code in question already, long ago -- what possible argument can be made for it now? This patch makes the "VACUUM (INDEX_CLEANUP OFF)" mechanism no longer get invoked as if it was like the "no indexes on table so do it all in one heap pass" optimization. This seems a lot clearer -- INDEX_CLEANUP OFF isn't able to call lazy_vacuum_page() at all (for the obvious reason), so any similarity between the two cases was always superficial -- skipping index vacuuming should not be confused with doing a one-pass VACUUM/having no indexes at all. The original INDEX_CLEANUP structure (from commits a96c41fe and dd695979) always seemed confusing to me for this reason, FWIW. Note that I've merged multiple existing functions in vacuumlazy.c into one: the patch merges lazy_vacuum_all_indexes() and lazy_vacuum_heap() into a single function named vacuum_indexes_mark_unused() (note also that lazy_vacuum_page() has been renamed to mark_unused_page() to reflect the fact that it is now strictly concerned with making LP_DEAD line pointers LP_UNUSED). The big idea is that there is one choke point that decides whether index vacuuming is needed at all at one point in time, dynamically. vacuum_indexes_mark_unused() decides this for us at the last moment. This can only happen during a VACUUM that has enough memory to fit all TIDs -- otherwise we won't skip anything dynamically. We may in the future add additional criteria for skipping index vacuuming. That can now just be added to the beginning of this new vacuum_indexes_mark_unused() function. We may even teach vacuum_indexes_mark_unused() to skip some indexes but not others in a future release, a possibility that was already discussed at length earlier in this thread. This new structure has all the context it needs to do all of these things. I wonder if we can add some kind of emergency anti-wraparound vacuum logic to what I have here, for Postgres 14. Can we come up with logic that has us skip index vacuuming because XID wraparound is on the verge of causing an outage? That seems like a strategically important thing for Postgres, so perhaps we should try to get something like that in. Practically every post mortem blog post involving Postgres also involves anti-wraparound vacuum. One consequence of my approach is that we now call lazy_cleanup_all_indexes(), even when we've skipped index vacuuming itself. We should at least "check-in" with the indexes IMV. To an index AM, this will be indistinguishable from a VACUUM that never had tuples for it to delete, and so never called ambulkdelete() before calling amvacuumcleanup(). This seems logical to me: why should there be any significant behavioral divergence between the case where there are 0 tuples to delete and the case where there is 1 tuple to delete? The extra work that we perform in amvacuumcleanup() (if any) should almost always be a no-op in nbtree following my recent refactoring work. More generally, if an index AM is doing too much during cleanup, and this becomes a bottleneck, then IMV that's a problem that needs to be fixed in the index AM. Masahiko: Note that I've also changed the SKIP_VACUUM_PAGES_RATIO logic to never reset the count of heap blocks with one or more LP_DEAD line pointers, per remarks in a recent email [5] -- that's now a table level count of heap blocks. What do you think of that aspect? (BTW, I pushed your fix for the "not setting has_dead_tuples/has_dead_items variable" issue today, just to get it out of the way.) [1] https://postgr.es/m/CAD21AoAtZb4+HJT_8RoOXvu4HM-Zd4HKS3YSMCH6+-W=bdy...@mail.gmail.com [2] https://postgr.es/m/23885.1555357618%40sss.pgh.pa.us [3] https://postgr.es/m/20130108024957.GA4751%40tornado.leadboat.com [4] https://postgr.es/m/16814.1555348381%40sss.pgh.pa.us [5] https://postgr.es/m/CAH2-Wznpywm4qparkQxf2ns3c7BugC4x7VzKjiB8OCYswwu-=g...@mail.gmail.com -- Peter Geoghegan
v2-0001-Skip-index-vacuuming-dynamically.patch
Description: Binary data