On Wed, Sep 27, 2023 at 11:03 AM Jeff Davis <pg...@j-davis.com> wrote: > So it looks like it's intentionally registering a clean buffer so that > it can take a cleanup lock for reasons other than cleaning (or even > modiying) the page. I would think that there's a better way of > accomplishing that goal, so perhaps we can fix that?
I had forgotten some of the details of how this works until you reminded me, but now that you've jarred my memory, I remember some of it. When Amit Kaplla and I were working on the project to add WAL-logging to hash indexes, we ran into some problems with concurrency control for individual buckets within the hash index. Before that project, this was handled using heavyweight locks, one per bucket. That got changed in 6d46f4783efe457f74816a75173eb23ed8930020 for the reasons explained in that commit message. Basically, instead of taking heavyweight locks, we started taking cleanup locks on the primary bucket pages. I always thought that was a little awkward, but I didn't quite see how to do better. I don't think that I gave much thought at the time to the consequence you've uncovered here, namely that it means we're sometimes locking one page (the primary bucket page) because we want to do something to another bucket page (some page in the linked list of pages that are part of that bucket) and for that to be safe, we need to lock out concurrent scans of that bucket. I guess I don't know of any easy fix here. :-( -- Robert Haas EDB: http://www.enterprisedb.com