On Tue, Jun 5, 2018 at 7:35 PM, Alexander Korotkov <a.korot...@postgrespro.ru> wrote:
> On Tue, Jun 5, 2018 at 4:02 PM Andres Freund <and...@anarazel.de> wrote:
> > On 2018-06-05 13:09:08 +0300, Alexander Korotkov wrote:
> > > It appears that buffer replacement happening inside relation
> > > extension lock is affected by starvation on exclusive buffer mapping
> > > lwlocks and buffer content lwlocks, caused by many concurrent shared
> > > lockers. So, fair lwlock patch have no direct influence to relation
> > > extension lock, which is naturally not even lwlock...
> >
> > Yea, that makes sense. I wonder how much the fix here is to "pre-clear"
> > a victim buffer, and how much is a saner buffer replacement
> > implementation (either by going away from O(NBuffers), or by having a
> > queue of clean victim buffers like my bgwriter replacement).
>
> The particular thing I observed on our environment is BufferAlloc()
> waiting hours on buffer partition lock. Increasing NUM_BUFFER_PARTITIONS
> didn't give any significant help. It appears that very hot page (root
> page of some frequently used index) reside on that partition, so this
> partition was continuously under shared lock. So, in order to resolve
> without changing LWLock, we probably should move our buffers hash table
> to something lockless.
>

I think Robert's chash stuff [1] might be helpful to reduce the contention
you are seeing.

[1] - https://www.postgresql.org/message-id/CA%2BTgmoYE4t-Pt%2Bv08kMO5u_XN-HNKBWtfMgcUXEGBrQiVgdV9Q%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com