On Fri, Feb 4, 2022 at 1:54 PM Robert Haas <robertmh...@gmail.com> wrote: > > The constantly modified index will be entirely dependent on index > > vacuuming here, and so an improved VACUUM design that allows that > > particular index to be vacuumed more frequently could really improve > > performance. > > Thanks for checking my work here - I wasn't 100% sure I had the right idea.
I should perhaps have emphasized individual leaf pages, rather than total index size. Presumably we only need to store so many extra versions per logical row at any one time, and we have a fair amount of free space for extra versions on leaf pages. Typically 10%- 30% of the space from the page (typical when it isn't already inevitable that the page will eventually split due to simple inserts). A traditional guarantee with B-Trees is that we get `ln(2)` space utilization with random insertions, which leaves just over 30% of the page free for later updates -- that's where I got 30% from. There is a complementary effect with deduplication, since that buys us time before the page has to split, making it much more likely that the split will be avoided entirely. It's very nonlinear. As I said, the competition between older snapshots and garbage collection can still lead to version-driven page splits (especially when non-hot updates are concentrated in one part of the key space, or one leaf page). But that's arguably a good thing -- it naturally relieves contention. There are actually designs that artificially split B-Tree pages early [1], detecting concurrency control related contention. Other systems need concurrency control in indexes, which we avoid by having versions live in indexes. [1] http://cidrdb.org/cidr2021/papers/cidr2021_paper21.pdf -- Peter Geoghegan