On Wed, Apr 3, 2013 at 9:49 PM, Greg Smith <g...@2ndquadrant.com> wrote:
> On 4/2/13 11:54 AM, Robert Haas wrote:
>> But, having said that, I still think the best idea is what Andres
>> proposed, which pretty much matches my own thoughts: the bgwriter
>> needs to populate the free list, so that buffer allocations don't have
>> to wait for linear scans of the buffer array.
>
> I was hoping this one would make it to a full six years of being on the
> TODO list before it came up again; missed it by a few weeks. The funniest
> part is that Amit even submitted a patch on this theme a few months ago
> without much feedback:
> http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C382852FF97@szxeml509-mbs
> That stalled where a few things have, on a) needing more regression test
> workloads, and b) wondering just what the deal with large shared_buffers
> settings degrading performance was.
Those are impressive results. I think we should seriously consider doing something like that for 9.4. TBH, although more workloads to test is always better, I don't think this problem is so difficult that we can't have some confidence in a theoretical analysis.

If I read the original thread correctly (and I haven't looked at the patch itself), the proposed patch would actually invalidate buffers before putting them on the freelist. That effectively amounts to reducing shared_buffers, so workloads that are just on the edge of what can fit in shared_buffers will be harmed, and so will those that benefit incrementally from increased shared_buffers.

What I think we should do instead is collect the buffers that we think are evictable and stuff them onto the freelist without invalidating them. When a backend allocates from the freelist, it can double-check that the buffer still has usage_count 0. The odds should be pretty good. But even if we sometimes notice that the buffer has been touched again after being put on the freelist, we haven't expended all that much extra effort, and that effort happened mostly in the background.

Consider a scenario where only 10% of the buffers have usage count 0 (which is not unrealistic). We scan 5000 buffers and put 500 on the freelist. Now suppose that, due to some accident of the workload, 75% of those buffers get touched again before they're allocated off the freelist (which I believe to be a pessimistic estimate for most workloads). That means that only 125 of those 500 buffers will succeed in satisfying an allocation request. That's still a huge win, because it means that each backend only has to examine an average of 4 buffers before it finds one to allocate. If it had needed to do the linear scan itself, it would have had to touch 40 buffers before finding one to allocate. In real life, I think the gains are apt to be, if anything, larger.
IME, it's common for most or all of the buffer pool to be pinned at usage count 5. So you could easily have a situation where the arena scan has to visit millions of buffers to find one to allocate. If that's happening in the background instead of the foreground, it's a huge win.

Also, note that there's nothing to prevent the arena scan from happening in parallel with allocations off of the freelist - so while foreground processes are emptying the freelist, the background process can be looking for more things to add to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers