On 2025-Jul-17, Andrey Borodin wrote:

> Thinking more about the problem I see 3 ways to deal with this deadlock:
> 1. We check for recovery conflict even in presence of
> InterruptHoldoffCount. That's what patch v4 does.
> 2. Teach page_collect_tuples() to do HeapTupleSatisfiesVisibility()
> without holding buffer lock.
> 3. Why do we even HOLD_INTERRUPTS() when acquiring a shared lock??

Hmm, as you say, doing (3) is a very invasive system-wide change, but
can we do it in a more localized way?  I mean, what if we do
RESUME_INTERRUPTS() just before going to sleep on the CV, and restore
with HOLD_INTERRUPTS() once the sleep is done?  That would only affect
this one place rather than the whole system, and should also (AFAICS)
solve the issue.
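
To illustrate the idea, here is a toy model, not PostgreSQL code.  Only
the names HOLD_INTERRUPTS, RESUME_INTERRUPTS and InterruptHoldoffCount
come from the real tree; everything else (check_for_interrupts,
sleep_on_cv, wait_plain, wait_localized, run_scenario and the two flags)
is made up for the sketch, and it assumes the holdoff count is exactly 1
when we reach the sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the interrupt holdoff machinery; not the real definitions. */
static int  InterruptHoldoffCount = 0;
static bool conflict_pending = false;   /* hypothetical: a recovery conflict arrived */
static bool conflict_serviced = false;  /* hypothetical: somebody acted on it */

#define HOLD_INTERRUPTS()   (InterruptHoldoffCount++)
#define RESUME_INTERRUPTS() \
    do { \
        assert(InterruptHoldoffCount > 0); \
        InterruptHoldoffCount--; \
    } while (0)

/* Hypothetical model of an interrupt check: a no-op while interrupts are held off. */
static void
check_for_interrupts(void)
{
    if (InterruptHoldoffCount == 0 && conflict_pending)
    {
        conflict_pending = false;
        conflict_serviced = true;
    }
}

/* Hypothetical stand-in for the CV sleep; the conflict arrives while we wait. */
static void
sleep_on_cv(void)
{
    conflict_pending = true;
    check_for_interrupts();
}

/* Current behavior: sleep with interrupts still held off by the lock. */
static void
wait_plain(void)
{
    sleep_on_cv();
}

/* Proposed localized change: allow interrupts only around the sleep. */
static void
wait_localized(void)
{
    RESUME_INTERRUPTS();
    sleep_on_cv();
    HOLD_INTERRUPTS();
}

/* Runs one wait while "holding a lock"; returns whether the conflict was serviced. */
static bool
run_scenario(bool localized)
{
    InterruptHoldoffCount = 0;
    conflict_pending = false;
    conflict_serviced = false;

    HOLD_INTERRUPTS();          /* as done when the lock is acquired */
    if (localized)
        wait_localized();
    else
        wait_plain();
    RESUME_INTERRUPTS();        /* as done when the lock is released */

    return conflict_serviced;
}
```

In this model run_scenario(false) leaves the conflict unserviced while
run_scenario(true) services it, and the holdoff count is balanced again
afterwards.  Note that RESUME_INTERRUPTS() only decrements the counter
by one, so if the sleep can be reached with a deeper holdoff nesting,
interrupts would still be held and this alone would not be enough.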


> Yet, I see 3 as a correct solution. Can't we just abstain from
> HOLD_INTERRUPTS() if the LWLock being taken is not exclusive?

Hmm, the code in LWLockAcquire says

        /*
         * Lock out cancel/die interrupts until we exit the code section protected
         * by the LWLock.  This ensures that interrupts will not interfere with
         * manipulations of data structures in shared memory.
         */
        HOLD_INTERRUPTS();

which means if we want to change this, we would have to inspect every
single use of LWLocks in shared mode in order to be certain that such a
change isn't problematic.  This is a discussion I'm not prepared for.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Si quieres ser creativo, aprende el arte de perder el tiempo"

