On Mon, Jun 18, 2018 at 1:42 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
> Actually, once btree tids are sorted, you can continue tree descent
> all the way to the exact leaf page that contains the tuple to be
> deleted.
>
> Thus, the single-tuple interface ends up being quite OK. Sure, you can
> optimize things a bit by scanning a range, but only if vacuum is able
> to group keys in order to produce the optimized calls, and I don't see
> that terribly likely.

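To make the trade-off concrete, here is a toy sketch (not nbtree code) of
the two interfaces under discussion. It assumes index entries are ordered
by (key, tid), so a lookup for one tuple lands on exactly one leaf page.
ToyIndex, delete_one, delete_batch and the fixed high keys are all
hypothetical names and simplifications; the only thing it measures is how
many root-to-leaf descents each approach needs.

```python
import bisect


class ToyIndex:
    """Toy stand-in for a btree: sorted (key, tid) entries split into leaf pages."""

    def __init__(self, entries, page_size=4):
        entries = sorted(entries)
        self.pages = [entries[i:i + page_size]
                      for i in range(0, len(entries), page_size)]
        # Assumption: each page's high key is fixed at build time.
        self.high_keys = [page[-1] for page in self.pages]
        self.descents = 0

    def _descend(self, target):
        """Count one descent and return the leaf page number for target."""
        self.descents += 1
        i = bisect.bisect_left(self.high_keys, target)
        return min(i, len(self.pages) - 1)

    def _delete_from_page(self, page_no, target):
        page = self.pages[page_no]
        pos = bisect.bisect_left(page, target)
        if pos < len(page) and page[pos] == target:
            del page[pos]
            return True
        return False

    def delete_one(self, key, tid):
        """Single-tuple interface: one descent per dead tuple."""
        target = (key, tid)
        return self._delete_from_page(self._descend(target), target)

    def delete_batch(self, targets):
        """Batched interface: one descent, then step right across leaf pages."""
        targets = sorted(targets)
        if not targets:
            return 0
        page_no = self._descend(targets[0])
        removed = 0
        for target in targets:
            # Moving to the right sibling is not a new descent.
            while (page_no < len(self.pages) - 1
                   and target > self.high_keys[page_no]):
                page_no += 1
            if self._delete_from_page(page_no, target):
                removed += 1
        return removed


# Ten duplicates of key 7 (one per heap block), plus two unrelated keys.
entries = [(7, (block, 1)) for block in range(10)] + [(3, (0, 1)), (9, (0, 2))]
dead = [(7, (block, 1)) for block in range(2, 8)]

one_at_a_time = ToyIndex(entries)
for key, tid in dead:
    one_at_a_time.delete_one(key, tid)

batched = ToyIndex(entries)
batched.delete_batch(dead)

print(one_at_a_time.descents, batched.descents)
```

In this toy model the single-tuple calls cost one descent per dead tuple,
while the batched call descends once and walks right. That only helps if
the caller can hand over the dead tuples grouped by key, which is exactly
the condition in question.
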
Andrey talked about a background worker that did processing + index
tuple deletion when handed off work by a user backend after it performs
HOT pruning of a heap page. I consider that idea to be a good place to
go with the patch, because in practice the big problem is workloads
that suffer from so-called "write amplification", where most index
tuples are created despite being "logically unnecessary" (e.g. one
index among several prevents an UPDATE being HOT-safe, making inserts
into most of the indexes "logically unnecessary").

I think that it's likely that only descending the tree once in order to
kill multiple duplicate index tuples in one shot will in fact be *very*
important (unless perhaps you assume that that problem is solved by
something else, such as zheap). The mechanism that Andrey describes is
rather unlike VACUUM as we know it today, but that's the whole point.

--
Peter Geoghegan