On Mon, Apr 1, 2013 at 4:09 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-04-01 08:28:13 -0500, Merlin Moncure wrote:
>> On Sun, Mar 31, 2013 at 1:27 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>> > On Friday, March 22, 2013, Ants Aasma wrote:
>> >> On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
>> >> > well, if you do a non-locking test first you could at least avoid some
>> >> > cases (and, if you get the answer wrong, so what?) by jumping to the
>> >> > next buffer immediately. if the non-locking test comes out good, only
>> >> > then do you do a hardware TAS.
>> >> >
>> >> > you could in fact go further and dispense with all locking in front of
>> >> > usage_count, on the premise that it's only advisory and not a real
>> >> > refcount. so you only lock if/when it's time to select a candidate
>> >> > buffer, and only then when you did a non-locking test first. this
>> >> > would of course require some amusing adjustments to various logical
>> >> > checks (usage_count <= 0, heh).
>> >>
>> >> Moreover, if the buffer happens to miss a decrement due to a data race,
>> >> there's a good chance that the buffer is heavily used and wouldn't need
>> >> to be evicted soon anyway. (If you arrange it to be a
>> >> read-test-inc/dec-store operation, then you will never go out of
>> >> bounds.) However, clocksweep and usage_count maintenance is not what is
>> >> causing contention, because that workload is distributed. The issue is
>> >> pinning and unpinning.
>> >
>> > That is one of multiple issues. Contention on the BufFreelistLock is
>> > another one. I agree that usage_count maintenance is unlikely to become
>> > a bottleneck unless one or both of those is fixed first (and maybe not
>> > even then).
>>
>> usage_count manipulation is not a bottleneck, but that is irrelevant. It
>> can be affected by other page contention, which can lead to priority
>> inversion. I don't believe there is any reasonable argument that sitting
>> and spinning while holding the BufFreelistLock is a good idea.
>
> In my experience the mere fact of accessing all the buffer headers (without
> locking, but still) can cause noticeable slowdowns in write-only/mostly
> workloads with large amounts of shmem.
> Due to the write-only nature, large numbers of buffers have similar usage
> counts (since they are infrequently touched after the initial insertion)
> and there are no free ones around, so the search for a buffer frequently
> runs through *all* buffer headers multiple times until it has decremented
> all usage counts to 0. Then comes a period where free buffers are found
> easily (since all usage counts from the current sweep point onwards are
> zero). After that it starts all over.
> I have now seen that scenario multiple times :(
Interesting -- I was thinking about that too, but it's a separate problem
with a different trigger. Maybe there should be a bailout so that after X
usage_count adjustments the sweeper summarily does an eviction, or maybe
the "max" declines from 5 once per hundred buffers inspected, or some such.

merlin
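(Editorial illustration, not part of the original message and not
PostgreSQL's bufmgr code: a minimal, single-threaded sketch of one plausible
reading of the two bailout heuristics above. All names and constants --
NBUFFERS, ADJUST_BAILOUT, DECAY_INTERVAL, clock_sweep_victim() -- are
hypothetical, and the buffer-header spinlocks and pin bookkeeping the thread
is actually discussing are omitted.)

#define NBUFFERS        1024
#define MAX_USAGE_COUNT 5       /* usage_count saturates here */
#define ADJUST_BAILOUT  2000    /* summary eviction after this many decrements */
#define DECAY_INTERVAL  100     /* relax the victim threshold every N buffers */

typedef struct
{
    int usage_count;
    int refcount;               /* 0 means unpinned and evictable */
} Buf;

static Buf buffers[NBUFFERS];
static int sweep_hand;          /* current clock-sweep position */

static int
clock_sweep_victim(void)
{
    int decrements = 0;
    int inspected = 0;
    int threshold = 0;          /* usage_count <= threshold is a victim */
    int best = -1;
    int best_count = MAX_USAGE_COUNT + 1;

    for (;;)
    {
        int  idx = sweep_hand;
        Buf *buf = &buffers[idx];

        sweep_hand = (sweep_hand + 1) % NBUFFERS;

        /* give up entirely if nothing is evictable */
        if (++inspected > NBUFFERS * (MAX_USAGE_COUNT + 1))
            return -1;

        /* every DECAY_INTERVAL buffers inspected, lower the bar for
         * eviction; equivalent in spirit to the "max" declining from 5 */
        if (inspected % DECAY_INTERVAL == 0 && threshold < MAX_USAGE_COUNT)
            threshold++;

        if (buf->refcount > 0)
            continue;           /* pinned, skip */

        if (buf->usage_count <= threshold)
            return idx;         /* normal victim */

        /* remember the least-used candidate in case we bail out */
        if (buf->usage_count < best_count)
        {
            best_count = buf->usage_count;
            best = idx;
        }

        buf->usage_count--;
        if (++decrements >= ADJUST_BAILOUT && best >= 0)
            return best;        /* summary eviction after X adjustments */
    }
}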