On Mon, Jan 25, 2021 at 10:48 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> I need to spend more time on benchmarking to study the behavior and I
> think without that it would be difficult to make a conclusion in this
> regard. So, let's not consider any action on this front till I spend
> more time to find the details.

It is true that I committed the patch without thorough review, which was less than ideal. I welcome additional review from you now.

I will say one more thing about it for now: Start with a workload, not with the code. Without bottom-up deletion (e.g. when using Postgres 13) with a simple though extreme workload that will experience version churn in indexes after a while, it still takes quite a few minutes for the first page to split (when the table is at least a few GB in size to begin with). When I was testing the patch I would notice that it could take 10 or 15 minutes for the deletion mechanism to kick in for the first time -- the patch really didn't do anything at all until perhaps 15 minutes into the benchmark, despite helping *enormously* by the 60 minute mark. And this is with significant skew, so presumably the first page that would split (in the absence of the bottom-up deletion feature) was approximately the page with the most skew -- most individual pages might have taken 30 minutes or more to split without the intervention of bottom-up deletion.

Relatively rare events (in this case would-be page splits) can have very significant long term consequences for the sustainability of a workload, so relatively simple targeted interventions can make all the difference. The idea behind bottom-up deletion is to allow the workload to figure out the best way of fixing its bloat problems *naturally*. The heuristics must be simple precisely because workloads are so varied and complicated. We must be willing to pay small fixed costs for negative feedback -- it has to be okay for the mechanism to occasionally fail in order to learn what works.

I freely admit that I don't understand all workloads. But I don't think anybody can. This holistic/organic approach has a lot of advantages, especially given the general uncertainty about workload characteristics. Your suspicion of the simple nature of the heuristics actually makes a lot of sense to me. I do get it.

-- Peter Geoghegan