On Wed, Apr 19, 2023 at 10:43 AM Andres Freund <and...@anarazel.de> wrote:
> I don't think the "evict on every buffer access" works quite that way - unless
> you have a completely even access pattern, buffer access frequency will
> increase usage count much more frequently on some buffers than others. And if
> you have a completely even access pattern, it's hard to come up with a good
> cache replacement algorithm...
My guess is that the most immediate problem in this area is "correlated
references" (to use the term from the LRU-K paper). I gave an example of
that here:

https://postgr.es/m/CAH2-Wzk7T9K3d9_NY+jEXp2qQGMYoP=gzmor8q1cv57sxaw...@mail.gmail.com

In other words, I think that the most immediate problem may in fact be
the tendency of usage_count to get incremented multiple times in
response to what is (for all intents and purposes) the same logical
page access. Even if it's not as important as I imagine, verifying that
our input information isn't garbage still seems like the logical place
to begin work in this general area. (There's a rough sketch of what
suppressing those duplicate increments might look like at the end of
this mail.)

It's difficult to be sure about that, because it's so hard to look at
just one problem in isolation. I suspect that you were right to point
out that a larger shared_buffers tends to look quite different from a
smaller shared_buffers. That same factor is going to complicate any
analysis of the specific problem that I've highlighted (to say nothing
of the way that contention complicates the picture).

There is an interesting paper that compared the hit rates seen for
TPC-C and TPC-E on relatively modern hardware:

https://www.cs.cmu.edu/~chensm/papers/TPCE-sigmod-record10.pdf

It concludes that the buffer misses for each workload look rather
similar past a certain buffer pool size: both workloads have cache
misses that seem totally random. The access patterns may be very
different, but that doesn't necessarily have any visible effect on
buffer misses, at least provided that you make certain modest
assumptions about buffer pool size relative to working set size.

The most sophisticated cache management algorithms (like ARC) work by
maintaining metadata about recently evicted buffers, which is used to
decide whether to favor recency over frequency (see the second sketch
below). If you work backwards, it follows that having cache misses that
look completely random is desirable, and perhaps even something to work
towards. What you really don't want is a situation where the same small
minority of pages keeps getting ping-ponged into and out of the buffer
pool, without ever settling, even though the buffer cache is large
enough for them to settle in principle. That pathological profile is
the furthest possible thing from random.

With a smaller shared_buffers, it's perhaps inevitable that buffer
cache misses are random, so I'd expect that managing the problem of
contention will tend to matter most. With a larger shared_buffers it
isn't inevitable at all, so the quality of the cache eviction scheme is
likely to matter quite a bit more.
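To make the correlated-references point a bit more concrete, here is a
toy standalone C program showing the LRU-K paper's "correlated
reference period" idea applied to a usage_count-style counter. This is
emphatically not a patch; every name in it is made up, apart from the
cap of 5, which matches BM_MAX_USAGE_COUNT. The idea is just that two
accesses arriving within the correlation window count as one logical
reference:

/*
 * Standalone sketch of the LRU-K paper's "correlated reference period"
 * idea, applied to a usage_count-style counter.  None of these names
 * come from the PostgreSQL tree; this illustrates the technique only.
 */
#include <stdio.h>
#include <stdint.h>

#define CORRELATION_WINDOW  100   /* ticks; tuning this is the hard part */
#define MAX_USAGE_COUNT     5     /* same cap as BM_MAX_USAGE_COUNT */

typedef struct SketchBuffer
{
    int      usage_count;
    uint64_t last_reference;      /* tick of last counted reference */
} SketchBuffer;

static uint64_t clock_tick = 0;   /* stand-in for a coarse timestamp */

static void
reference_buffer(SketchBuffer *buf)
{
    clock_tick++;

    /*
     * Only count this access as a new logical reference if it falls
     * outside the correlation window of the previous counted one (the
     * very first reference always counts).
     */
    if (buf->usage_count == 0 ||
        clock_tick - buf->last_reference > CORRELATION_WINDOW)
    {
        if (buf->usage_count < MAX_USAGE_COUNT)
            buf->usage_count++;
        buf->last_reference = clock_tick;
    }
}

int
main(void)
{
    SketchBuffer buf = {0, 0};

    /* A burst of 50 accesses in quick succession: one logical reference */
    for (int i = 0; i < 50; i++)
        reference_buffer(&buf);
    printf("after burst: usage_count = %d\n", buf.usage_count);   /* 1 */

    /* A later, genuinely separate access: a second logical reference */
    clock_tick += 1000;
    reference_buffer(&buf);
    printf("after gap:   usage_count = %d\n", buf.usage_count);   /* 2 */

    return 0;
}

Today, a plain clock-sweep counter would have reached the cap of 5
after that burst; with the window, the burst and the later access are
correctly seen as two logical references.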
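And here is an equally rough sketch of the ghost-list bookkeeping that
ARC-style algorithms depend on. Again, purely illustrative: the
structure and sizes are invented, and this is nowhere near a full ARC
implementation. The mechanism is just to remember the identities (not
the contents) of recently evicted pages, and to treat a miss on one of
them as evidence that the policy guessed wrong, which is what drives
ARC's recency-versus-frequency adaptation:

/*
 * Minimal sketch of ARC-style ghost-list bookkeeping.  Invented names
 * and sizes throughout; illustration only.
 */
#include <stdbool.h>
#include <stdio.h>

#define GHOST_SIZE 4              /* number of evicted page ids remembered */

typedef struct GhostList
{
    int  page[GHOST_SIZE];
    int  next;                    /* ring buffer position */
    int  ghost_hits;              /* "we evicted this too soon" events */
} GhostList;

static void
remember_eviction(GhostList *g, int page)
{
    g->page[g->next] = page;
    g->next = (g->next + 1) % GHOST_SIZE;
}

/* Called on a cache miss, before the page is read back in. */
static bool
was_recently_evicted(GhostList *g, int page)
{
    for (int i = 0; i < GHOST_SIZE; i++)
    {
        if (g->page[i] == page)
        {
            g->ghost_hits++;      /* ARC would enlarge the recency side here */
            return true;
        }
    }
    return false;
}

int
main(void)
{
    GhostList g = {{-1, -1, -1, -1}, 0, 0};

    remember_eviction(&g, 42);    /* page 42 falls out of the buffer pool */
    remember_eviction(&g, 7);

    /*
     * Page 42 is requested again shortly after being evicted: the
     * ping-pong pattern that a fixed-policy cache never notices.
     */
    if (was_recently_evicted(&g, 42))
        printf("ghost hit on page 42 (total: %d)\n", g.ghost_hits);

    return 0;
}

A steady stream of ghost hits is exactly the "same small minority of
pages ping-ponging" profile described above; random misses produce
almost none, which is why random-looking misses are the healthy case.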
--
Peter Geoghegan