Robert Haas <robertmh...@gmail.com> wrote on Thursday 24 March 2011 22:41:19:
> On Thu, Mar 24, 2011 at 5:34 PM, Greg Stark <gsst...@mit.edu> wrote:
> > On Thu, Mar 24, 2011 at 8:59 PM, Robert Haas <robertmh...@gmail.com> wrote:
> >> It seems at least plausible that buffer allocation could be
> >> significantly faster if it need only pop the head of a list, rather
> >> than scanning until it finds a suitable candidate.  Moving as much
> >> buffer allocation work as possible into the background seems like it
> >> ought to be useful.
> >
> > Linked lists are notoriously non-concurrent, that's the whole reason
> > for the clock sweep algorithm to exist at all instead of just using an
> > LRU directly. That said, an LRU needs to be able to remove elements
> > from the middle and not just enqueue elements on the tail, so the
> > situation isn't exactly equivalent.
> >
> > Just popping off the head is simple enough but the bgwriter would need
> > to be able to add elements to the tail of the list and the people
> > popping elements off the head would need to compete with it for the
> > lock on the list. And I think you need a single lock for the whole
> > list because of the cases where the list becomes a single element or
> > empty.
> >
> > The main impact this list would have is that it would presumably need
> > some real number of buffers to satisfy the pressure for victim buffers
> > for a real amount of time. That would represent a decrease in cache
> > size, effectively evicting buffers from cache as if the cache were
> > smaller by that amount.
> >
> > Theoretical results are that a small change in cache size affects
> > cache hit rates substantially. I'm not sure that's borne out by
> > practical experience with Postgres though. People tend to either be
> > doing mostly i/o or very little i/o.
> > Cache hit rate only really matters, and is only likely to be affected
> > by small changes in cache size, in the space in between.
>
> You wouldn't really have to reduce the effective cache size - there's
> logic in there to just skip to the next buffer if the first one you
> pull off the freelist has been reused. But the cache hit rates on
> those buffers would (you'd hope) be fairly low, since they are the
> ones we're about to reuse. Maybe it doesn't work out to a win,
> though.

If I may: under abnormal circumstances (e.g. the current process is
"held" by the kernel while it owns the lock), obtaining a buffer from
the list may be cheaper. This code:

    while (StrategyControl->firstFreeBuffer >= 0)
    {
        buf = &BufferDescriptors[StrategyControl->firstFreeBuffer];
        Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
        /* Unconditionally remove buffer from freelist */
        StrategyControl->firstFreeBuffer = buf->freeNext;
        buf->freeNext = FREENEXT_NOT_IN_LIST;

could look like this:

    do
    {
        SpinLock();
        if (StrategyControl->firstFreeBuffer < 0)
        {
            Unspin();
            break;              /* freelist is empty */
        }
        buf = &BufferDescriptors[StrategyControl->firstFreeBuffer];
        Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
        /* Unconditionally remove buffer from freelist */
        StrategyControl->firstFreeBuffer = buf->freeNext;
        buf->freeNext = FREENEXT_NOT_IN_LIST;
        Unspin();
        /* ... check whether the buffer is usable; if so, break ... */
    } while (true);

Acquiring a spinlock for the linked list is enough here, and cheaper
than taking an LWLock, which is more complex than a spin for this path.
After this, similarly with a spinlock on the clock sweep:

    trycounter = NBuffers * 4;
    for (;;)
    {
        SpinLock();
        buf = &BufferDescriptors[StrategyControl->nextVictimBuffer];
        if (++StrategyControl->nextVictimBuffer >= NBuffers)
        {
            StrategyControl->nextVictimBuffer = 0;
            StrategyControl->completePasses++;
        }
        Unspin();
        ...
    }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers