On Tue, Apr 15, 2014 at 9:44 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Apr 14, 2014 at 1:11 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> In the past, various hackers have noted problems they've observed with
>> this scheme. A common pathology is to see frantic searching for a
>> victim buffer only to find all buffer usage_count values at 5. It may
>> take multiple revolutions of the clock hand before a victim buffer is
>> found, as usage_count is decremented for each and every buffer. Also,
>> BufFreelistLock contention is considered a serious bottleneck [1],
>> which is related.
>
> I think that the basic problem here is that usage counts increase when
> buffers are referenced, but they decrease when buffers are evicted,
> and those two things are not in any necessary way connected to each
> other. In particular, if no eviction is happening, reference counts
> will converge to the maximum value. I've read a few papers about
> algorithms that attempt to segregate the list of buffers into "hot"
> and "cold" lists, and an important property of such algorithms is that
> they mustn't be allowed to make everything hot.
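Just to make the pathology concrete for anyone following along, the search
being described is roughly the loop below. This is only a standalone,
simplified sketch (not the real bufmgr.c code): it ignores pinned buffers,
the freelist, and all locking, and uses a toy pool of 8 buffers:

/*
 * Simplified sketch of the clock-sweep victim search.  Each time the hand
 * passes a buffer whose usage_count is above zero, the count is decremented
 * and the hand moves on; a buffer can only be evicted once its count has
 * reached zero.  If every buffer is at the maximum (5), several full
 * revolutions are needed before the first victim turns up.
 */
#include <stdio.h>

#define NBUFFERS   8
#define MAX_USAGE  5

static int usage_count[NBUFFERS];
static int clock_hand;

static int
find_victim(long *examined)
{
    for (;;)
    {
        int buf = clock_hand;

        clock_hand = (clock_hand + 1) % NBUFFERS;
        (*examined)++;

        if (usage_count[buf] == 0)
            return buf;         /* evictable at last */

        usage_count[buf]--;     /* give it another chance, keep sweeping */
    }
}

int
main(void)
{
    long examined = 0;

    /* Worst case: every buffer has been referenced up to the cap. */
    for (int i = 0; i < NBUFFERS; i++)
        usage_count[i] = MAX_USAGE;

    int victim = find_victim(&examined);

    printf("victim %d found after examining %ld buffers (pool of %d)\n",
           victim, examined, NBUFFERS);
    return 0;
}

With every count at the cap, that toy loop has to examine 41 buffers before
it finds its first victim; scale the pool up to a production shared_buffers
setting and it's easy to see why the sweep, run under BufFreelistLock, hurts
so much.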
It's possible that I've misunderstood what you mean here, but do you really
think it's likely that everything will be hot, in the event of using
something like what I've sketched here? I think it's an important measure
against this general problem that buffers really have to earn the right to
be considered hot, so to speak. With my prototype, in order for a buffer to
become as hard to evict as possible, at a minimum it must be *continually*
pinned for at least 30 seconds. That's actually a pretty tall order.

Although, as I said, I wouldn't be surprised if it were worth making it
possible for buffers to become even more difficult to evict than that. For
example, it should be extremely difficult to evict a B-Tree root page, and
to a lesser extent inner pages, even under a lot of cache pressure. There
are lots of workloads in which that kind of eviction can happen, and I have
a hard time believing it's ever worth it, given the extraordinary difference
in utility between those pages and most others. I can imagine it being well
worth maintaining a huge barrier against evicting what is actually a
relatively tiny number of pages.

I don't want to dismiss what you're saying about heating and cooling being
unrelated, but I don't find it obvious that not everything can be hot.
Maybe "heat" should be relative rather than absolute, and maybe that's
actually what you meant. There is surely some workload where buffer access
really is perfectly uniform; what do you do there? What "temperature" are
those buffers?

It occurs to me that within the prototype patch, even though usage_count is
incremented in a vastly slower fashion (in a wall-time sense), clock sweep
doesn't take advantage of that. I should probably investigate having clock
sweep decrement more aggressively when it realizes that it won't get some
buffer's usage_count down to zero on the next revolution either. There are
certainly problems with that, but they might be fixable. Within the patch,
an average of 1.5 seconds must pass before a usage_count can be incremented
again, so if clock sweep were to anticipate another revolution that sets
nothing to zero, it seems pretty likely that it would be exactly right, or
if not then close enough, since it can only really fail to account for some
buffers getting incremented once more in the interim. Conceptually,
multiple logical revolutions would be merged into one actual revolution,
sufficient to have the next revolution find a victim buffer.

--
Peter Geoghegan
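PS: To make the "merged revolutions" idea a bit more concrete, here is a toy
sketch of roughly what I have in mind. It's standalone C rather than a patch
against bufmgr.c, it ignores locking and concurrent increments entirely, and
it glosses over the fact that buffers sitting behind the hand don't get
their counts pulled down when the shortcut is taken:

/*
 * Toy sketch of merging several logical clock-sweep revolutions into one
 * physical revolution.  If a whole revolution fails to find a victim, the
 * next revolution pretends that "step" ordinary revolutions happen at
 * once: a buffer whose usage_count is below step would have been caught at
 * zero during one of those logical revolutions, so it is taken as the
 * victim immediately.
 */
#include <stdio.h>

#define NBUFFERS   8
#define MAX_USAGE  5

static int usage_count[NBUFFERS];
static int clock_hand;

static int
find_victim_merged(void)
{
    int step = 1;               /* ordinary clock-sweep behaviour */

    for (;;)
    {
        int min_remaining = MAX_USAGE;

        for (int i = 0; i < NBUFFERS; i++)
        {
            int buf = clock_hand;

            clock_hand = (clock_hand + 1) % NBUFFERS;

            /* Would have hit zero within "step" logical revolutions. */
            if (usage_count[buf] < step)
                return buf;

            usage_count[buf] -= step;
            if (usage_count[buf] < min_remaining)
                min_remaining = usage_count[buf];
        }

        /*
         * A full revolution found nothing.  Merge just enough logical
         * revolutions into the next one to guarantee that it finds a
         * victim: the buffer(s) left with the smallest count.
         */
        step = min_remaining + 1;
    }
}

int
main(void)
{
    for (int i = 0; i < NBUFFERS; i++)
        usage_count[i] = MAX_USAGE;

    printf("victim is buffer %d\n", find_victim_merged());
    return 0;
}

The step is chosen to be just large enough that the buffer(s) left with the
lowest count qualify on the next revolution; anything more aggressive would
start taking buffers that would otherwise have survived another pass.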