On 27/09/2023 18:47, Robert Haas wrote:
On Wed, Sep 27, 2023 at 11:03 AM Jeff Davis <pg...@j-davis.com> wrote:
So it looks like it's intentionally registering a clean buffer so that
it can take a cleanup lock for reasons other than cleaning (or even
modiying) the page. I would think that there's a better way of
accomplishing that goal, so perhaps we can fix that?
I had forgotten some of the details of how this works until you
reminded me, but now that you've jarred my memory, I remember some of
it.
When Amit Kaplla and I were working on the project to add WAL-logging
to hash indexes, we ran into some problems with concurrency control
for individual buckets within the hash index. Before that project,
this was handled using heavyweight locks, one per bucket. That got
changed in 6d46f4783efe457f74816a75173eb23ed8930020 for the reasons
explained in that commit message. Basically, instead of taking
heavyweight locks, we started taking cleanup locks on the primary
bucket pages. I always thought that was a little awkward, but I didn't
quite see how to do better. I don't think that I gave much thought at
the time to the consequence you've uncovered here, namely that it
means we're sometimes locking one page (the primary bucket page)
because we want to do something to another bucket page (some page in
the linked list of pages that are part of that bucket) and for that to
be safe, we need to lock out concurrent scans of that bucket.
I guess I don't know of any easy fix here. :-(
That seems OK. Maybe there would be a better way to do that, but there's
nothing wrong with that approach per se.
We could define a new REGBUF_NO_CHANGE flag to signal that the buffer is
registered just for locking purposes, and not actually modified by the
WAL record.
--
Heikki Linnakangas
Neon (https://neon.tech)