On Fri, Apr 23, 2021 at 1:04 PM Peter Geoghegan <p...@bowt.ie> wrote:
> I think that a simple heuristic could work very well here, but it
> needs to be at least a little sensitive to the extremes. And I mean
> all of the extremes, not just the one from my example -- every
> variation exists and will cause problems if given zero weight.

To expand on this a bit, my objection to counting the number of live tuples in the index (as a means to determining how aggressively each individual index needs to be vacuumed) is this: it's driven by positive feedback, not negative feedback. We should focus on *extreme* adverse events (e.g., version-driven page splits) instead. We don't even need to understand ordinary adverse events (e.g., how many dead tuples are in the index).

The cost of accumulating dead tuples in an index (could be almost any index AM) grows very slowly at first, and then suddenly explodes (actually it's more like a cascade of correlated explosions, but for the purposes of this explanation that doesn't matter). In a way, this makes life easy for us. The cost of accumulating dead tuples rises so dramatically at a certain inflection point that we can reasonably assume that that's all that matters -- just stop the explosions. An extremely simple heuristic that prevents these extreme adverse events can work very well because that's where almost all of the possible downside is. We can be sure that these extreme adverse events are universally very harmful (workload doesn't matter). Note that the same is not true for an approach driven by positive feedback -- it'll be fragile because it depends on workload characteristics in unfathomably many ways.

We should focus on what we can understand with a high degree of confidence. We just need to identify what the extreme adverse event is in each index AM, count them, and focus on those (could be a VACUUM thing, could be local to the index AM like bottom-up deletion is). We need to notice when things are *starting* to go really badly and intervene aggressively. So we need to be willing to try a generic index vacuuming strategy first, and then notice that it has just failed, or is just about to fail. Something like version-driven page splits really shouldn't ever happen, so even a very crude approach will probably work very well.

--
Peter Geoghegan
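
The heuristic argued for above (count the extreme adverse events per index, start with a generic strategy, and escalate only once those events appear) can be illustrated with a minimal sketch. This is not PostgreSQL code: `IndexStats`, `adverse_events`, `choose_strategy`, `plan_index_vacuum`, and the default threshold of 1 are all hypothetical names and values chosen for illustration, on the assumption that each index AM could expose some counter of its own extreme event (for a B-Tree, version-driven page splits).

```python
from dataclasses import dataclass


@dataclass
class IndexStats:
    """Hypothetical per-index counters; not a real PostgreSQL structure."""
    name: str
    # Count of extreme adverse events since the last vacuum, e.g.
    # version-driven page splits for a B-Tree (assumed to be available).
    adverse_events: int = 0


def choose_strategy(stats: IndexStats, threshold: int = 1) -> str:
    """Return "aggressive" once extreme events reach the threshold.

    The threshold defaults to 1 because the argument is that such events
    should essentially never happen, so any occurrence is a signal.
    """
    if stats.adverse_events >= threshold:
        return "aggressive"
    return "generic"


def plan_index_vacuum(indexes: list[IndexStats], threshold: int = 1) -> dict[str, str]:
    """Decide a strategy per index; dead or live tuple counts are ignored."""
    return {s.name: choose_strategy(s, threshold) for s in indexes}


if __name__ == "__main__":
    indexes = [
        IndexStats("orders_pkey", adverse_events=0),
        IndexStats("orders_status_idx", adverse_events=3),
    ]
    for name, strategy in plan_index_vacuum(indexes).items():
        print(f"{name}: {strategy}")
```

The decision deliberately takes no input other than the extreme-event counter, which is the point of the argument: ordinary measures such as the number of dead tuples play no role in the sketch.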