On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote:
> On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> > On Friday, March 22, 2013, Ants Aasma wrote:
> >>
> >> On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure <mmonc...@gmail.com>
> >> wrote:
> >> > well if you do a non-locking test first you could at least avoid some
> >> > cases (and, if you get the answer wrong, so what?) by jumping to the
> >> > next buffer immediately. if the non-locking test comes back good, only
> >> > then do you do a hardware TAS.
> >> >
> >> > you could in fact go further and dispense with all locking in front of
> >> > usage_count, on the premise that it's only advisory and not a real
> >> > refcount. so you only lock if/when it's time to select a candidate
> >> > buffer, and only then when you did a non-locking test first. this
> >> > would of course require some amusing adjustments to various logical
> >> > checks (usage_count <= 0, heh).
> >>
> >> Moreover, if the buffer happens to miss a decrement due to a data
> >> race, there's a good chance that the buffer is heavily used and
> >> wouldn't need to be evicted soon anyway. (If you arrange it as a
> >> read-test-inc/dec-store operation then you will never go out of
> >> bounds.) However, clocksweep and usage_count maintenance is not what
> >> is causing contention, because that workload is distributed. The
> >> issue is pinning and unpinning.
> >
> > That is one of multiple issues. Contention on the BufFreelistLock is
> > another one. I agree that usage_count maintenance is unlikely to become
> > a bottleneck unless one or both of those is fixed first (and maybe not
> > even then).
>
> usage_count manipulation is not a bottleneck, but that is irrelevant.
> It can be affected by other page contention, which can lead to priority
> inversion. I don't believe there is any reasonable argument that
> sitting and spinning while holding the BufFreelistLock is a good idea.
In my experience the mere fact of (unlockedly, but still) accessing all the
buffer headers can cause noticeable slowdowns in write-only/write-mostly
workloads with large amounts of shmem. Due to the write-only nature, large
numbers of buffers end up with similar usage counts (since they are
infrequently touched after the initial insertion) and there are no free
ones around, so the search for a victim buffer frequently runs through
*all* buffer headers multiple times until it has decremented every usage
count to 0. Then comes a period where free buffers are found easily (since
all usage counts from the current sweep point onwards are zero). After
that it starts all over.

I have now seen that scenario multiple times :(

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers