On Tue, Apr 15, 2014 at 7:30 PM, Peter Geoghegan <p...@heroku.com> wrote: > Frankly, there doesn't need to be any research on this, because it's > just common sense that probabilistically, leaf pages are much more > useful than heap pages in servicing index scan queries if we assume a > uniform distribution. If we don't assume that, then they're still more > useful on average.
I don't think "common sense" is compelling. I think you need to pin down exactly what it is about btree intermediate pages that the LRU isn't capturing and not just argue they're more useful. The LRU is already capturing which pages are more heavily used than others so you need to identify what it is that makes index pages *even more* useful than their frequency and recency of access indicates. Not just that they're more useful than an average page. So what I think is missing is that indexes are always accessed from the root down to the leaf. So the most recent page accessed will always be the leaf. And in whatever chain of pages was used to reach the last leaf page the least recently accessed will always be the root. But we'll need the root page again on the subsequent descent even if it's to reach the same leaf page we kept in ram in preference to it. Now it doesn't *always* make sense to keep an intermediate page over leaf pages. Imagine an index that we always do full traversals of. We'll always descend from the root down the left-most pages and then follow the right pointers across. All the other intermediate pages will be cold. If we do an occasional descent probing for other keys those leaf pages shouldn't be cached since they won't be needed again for the common full index traversals and the next occasional probe will probably be looking for different keys. But if we're often probing for the same keys the last thing we want to do is throw away one of the intermediate pages for those keys when we could throw away a leaf page. But that's what would happen in a strict LRU. It's almost like what we would really want to do is mark the pages as least recently used in the opposite order from the order they're actually accessed when descending. Or perhaps bump the usage count to max+1 when it's an intermediate page so that it takes one extra cycle of decrementing before it's considered old compared to a leaf page. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers