On Mon, Oct 12, 2020 at 3:47 AM Andrey Borodin <x4...@yandex-team.ru> wrote:
> The idea looks very interesting.
> It resembles GiST microvacuum: GiST tries to vacuum single page before split.
AFAICT the GiST microvacuum mechanism is based on the one in nbtree, which is based on setting LP_DEAD bits when index scans find that the TIDs are dead-to-all. That's easy to justify, because it is cheap to save future index scans the trouble of following the TIDs just to discover the same thing for themselves.

The difference here is that we're making an intelligent guess -- there have been no index scans, but we're going to do a kind of special index scan at the last minute to see if we can avoid a page split. I think that this is okay because in practice we can do it in a reasonably targeted way. We will never do it with INSERTs, for example (except in unique indexes, and only when we know that there are duplicates because we saw them in _bt_check_unique()). In the worst case we make non-HOT updates a bit slower...and I'm not even sure that that's a bad thing when we don't manage to delete anything. After all, non-HOT updates impose a huge cost on the system. They have "negative externalities".

Another way to look at it is that the mechanism I propose to add takes advantage of "convexity" [1]. (Actually, several of my indexing ideas are based on similar principles -- like the nbtsplitloc.c stuff.)

Attached is v2. It controls the cost of visiting the heap by finding the heap page that has the most TIDs that we might be able to kill (among those that are duplicates on a leaf page); a rough standalone sketch of this block-picking idea appears at the end of this email. It also adds a hinting mechanism to the executor to avoid uselessly applying the optimization with INSERTs.

> I'm curious how the cost of page deduplication compares to the cost of a page split?
> Should we do deduplication if the page will still remain 99% full?

It depends on how you measure it, but in general I would say that the cost of traditional Postgres 13 deduplication is much lower than the cost of a page split, especially as measured in WAL volume. I also believe that the same is true for this new delete deduplication mechanism. The way we determine which heap pages to visit maximizes the chances that we'll get lucky while minimizing the cost (currently we don't visit more than 2 heap pages unless we get at least one kill -- I think I could be more conservative here without losing much). A page split also has to exclusive-lock two other pages (the right sibling page and the parent page), so even the locking is perhaps better.

The attached patch can completely or almost completely avoid index bloat in extreme cases with non-HOT updates. This can easily increase throughput by 2x or more, depending on how extreme you make it (i.e. how many indexes you have). It seems like the main problem caused by non-HOT updates is in fact page splits themselves, not so much the dirtying of pages.

You can test this with a benchmark like the one that was used for WARM back in 2017:

https://www.postgresql.org/message-id/flat/CABOikdMNy6yowA%2BwTGK9RVd8iw%2BCzqHeQSGpW7Yka_4RSZ_LOQ%40mail.gmail.com

I suspect that this patch is maybe even more effective than WARM was with this benchmark.

[1] https://fooledbyrandomness.com/ConvexityScience.pdf

--
Peter Geoghegan
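To make the block-picking heuristic a bit more concrete, here is a rough standalone sketch. It is not code from the attached patch -- every name, type, and number below is made up for illustration -- but it shows the general shape: pick the single heap block that appears most often among a leaf page's duplicate TIDs, and give up after visiting two heap pages without a single kill.

/*
 * Standalone sketch only (not from the patch): choose the heap block that
 * the largest number of candidate duplicate TIDs point to, and stop
 * visiting heap pages once two of them have been checked without any kill.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct CandidateTid
{
	uint32_t	heapblk;		/* heap block number */
	uint16_t	offnum;			/* line pointer offset within that block */
} CandidateTid;

/*
 * Return the heap block that the most candidate TIDs point to.  Visiting
 * that block first maximizes the chance that a single heap page access
 * lets us kill several index tuples at once.
 */
static uint32_t
most_promising_block(const CandidateTid *cands, int ncands)
{
	uint32_t	bestblk = cands[0].heapblk;
	int			bestcount = 0;

	for (int i = 0; i < ncands; i++)
	{
		int			count = 0;

		for (int j = 0; j < ncands; j++)
		{
			if (cands[j].heapblk == cands[i].heapblk)
				count++;
		}
		if (count > bestcount)
		{
			bestcount = count;
			bestblk = cands[i].heapblk;
		}
	}
	return bestblk;
}

/* Stand-in for the real visibility check: dead to all snapshots? */
static bool
heap_tuple_is_dead(uint32_t heapblk, uint16_t offnum)
{
	(void) offnum;
	/* pretend that everything on block 7 happens to be dead */
	return heapblk == 7;
}

int
main(void)
{
	CandidateTid cands[] = {
		{7, 1}, {7, 4}, {7, 9}, {12, 2}, {12, 5}, {31, 3},
	};
	int			ncands = (int) (sizeof(cands) / sizeof(cands[0]));
	int			pages_visited = 0;
	int			killed = 0;

	while (ncands > 0)
	{
		uint32_t	blk = most_promising_block(cands, ncands);
		int			remaining = 0;

		pages_visited++;

		/* "Visit" blk: kill what we can, keep only TIDs from other blocks */
		for (int i = 0; i < ncands; i++)
		{
			if (cands[i].heapblk == blk)
			{
				if (heap_tuple_is_dead(blk, cands[i].offnum))
					killed++;
			}
			else
				cands[remaining++] = cands[i];
		}
		ncands = remaining;

		/* Give up after two heap pages unless we got at least one kill */
		if (pages_visited >= 2 && killed == 0)
			break;
	}

	printf("visited %d heap page(s), killed %d index tuple(s)\n",
		   pages_visited, killed);
	return 0;
}

The real thing obviously has to work in terms of buffer locks, visibility checks, and WAL; the sketch is only meant to show the shape of the cost control, i.e. why the number of heap page visits stays small even when a leaf page has many duplicates.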
v2-0001-Add-delete-deduplication-to-nbtree.patch