> On 30 Jun 2025, at 15:58, Andrey Borodin <x4...@yandex-team.ru> wrote:
>
> page_collect_tuples() holds a lock on the buffer while examining tuples
> visibility, having InterruptHoldoffCount > 0. Tuple visibility check might
> need WAL to go on, we have to wait until some next MX be filled in.
> Which might need a buffer lock or have a snapshot conflict with caller of
> page_collect_tuples().
Thinking more about the problem, I see 3 ways to deal with this deadlock:
1. We check for recovery conflicts even when InterruptHoldoffCount > 0.
That's what patch v4 does.
2. Teach page_collect_tuples() to call HeapTupleSatisfiesVisibility() without
holding the buffer lock.
3. Why do we even HOLD_INTERRUPTS() when acquiring a shared lock?
Personally, I see option 2 as very invasive in code that I'm not too familiar
with. Option 1 is clumsy. But option 3 is a giant system-wide change.
Yet, I see option 3 as the correct solution. Can't we just abstain from
HOLD_INTERRUPTS() if the LWLock being taken is not exclusive?
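
To make the question in option 3 concrete, here is a toy model in C. It is
not PostgreSQL code and not a patch: every toy_* name is hypothetical, the
counter only stands in for InterruptHoldoffCount, and the locking itself is
left out. It only illustrates the assumed rule "hold off interrupts for
exclusive acquisitions, leave them enabled for shared ones".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy type; this is not PostgreSQL's LWLockMode. */
typedef enum { TOY_LW_EXCLUSIVE, TOY_LW_SHARED } ToyLockMode;

/* Stand-in for InterruptHoldoffCount (assumption: a plain counter). */
int toy_holdoff_count = 0;

/*
 * Assumed variant of lock acquisition: only an exclusive acquisition
 * bumps the holdoff counter, which models HOLD_INTERRUPTS().
 * The actual lock acquisition is deliberately omitted.
 */
void toy_lock_acquire(ToyLockMode mode)
{
    if (mode == TOY_LW_EXCLUSIVE)
        toy_holdoff_count++;
}

/*
 * Release mirrors acquire: only an exclusive release decrements the
 * counter, which models RESUME_INTERRUPTS().
 */
void toy_lock_release(ToyLockMode mode)
{
    if (mode == TOY_LW_EXCLUSIVE)
    {
        assert(toy_holdoff_count > 0);
        toy_holdoff_count--;
    }
}

/* Interrupts can be serviced only while nothing holds them off. */
bool toy_interrupts_allowed(void)
{
    return toy_holdoff_count == 0;
}
```

In this model a backend holding only shared locks could still react to a
recovery conflict, while an exclusive holder could not. It does not model
why the current code holds off interrupts for every lock mode, which is
exactly the open question.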
Best regards, Andrey Borodin.