On Tue, Apr 20, 2021 at 11:18 AM Andrey Borodin <x4...@yandex-team.ru> wrote: > BTW take a look into PGM [0]. I'm slowly working on implementing it. > I think it is kind of straightforward to implement it as extension. > I've started from forking B-tree[1]. I've removed support of anything that is > not int4. > Then I plan to remove opclass\comparator abstraction layer. > And then add interpolation search over the page before binsearch.
FWIW I'm a skeptic of learned indexes. There are lots of reasons why I don't think that they're practical, but it boils down to this: in general data structures that work well don't need anybody to make a case for them. They simply work well for the task they were designed for. Anything is possible, but I strongly doubt that learned indexes are the exception to that general rule. Why hasn't anybody been able to apply the techniques in any real world database system to date? I'm sure that it's possible to make things much faster if it's possible to make certain wide-reaching assumptions. They talk about big improvements in space utilization compared to B-Trees as if there was some very fixed idea of how that works in B-Trees. Why haven't they compared their model to grotty heuristics that practical experience has shown to work, such as the rightmost leaf page split heuristic? Any paper about learned indexes that I've ever read doesn't even acknowledge the existence of these heuristics. -- Peter Geoghegan