On 2021-May-13, Tom Lane wrote:

> BTW, another nasty thing I discovered while testing this is that
> the CHECK_FOR_INTERRUPTS() at line 2146 is useless, because
> we're holding a buffer lock there so InterruptHoldoffCount > 0.
> So once you get into this loop you can't even cancel the query.
> Seems like that needs a fix, too.

This comment reminded me of a patch I've had for a while, which splits
the CHECK_FOR_INTERRUPTS() definition in two.  One of the pieces is
INTERRUPTS_PENDING_CONDITION(), which lets us test the condition
separately; that allows the lock we hold to be released before actually
processing the interrupts.

The btree code modified here was found to be an actual problem in
production, when a btree was corrupted in such a way that vacuum would
get stuck in an infinite loop.  I don't remember the exact details, but
I think we saw vacuum running for a couple of weeks, and we had to
restart the server in order to terminate it (since it wouldn't respond
to signals).

-- 
Álvaro Herrera       Valdivia, Chile
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."                               (Fotis)
               (http://archives.postgresql.org/pgsql-sql/2006-06/msg00265.php)

From 5a008141f135bef5ba933b1e3b65c457f58ad85a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 13 May 2021 11:41:19 -0400
Subject: [PATCH] Split CHECK_FOR_INTERRUPTS

This allows the condition to be checked even when in an interrupts-held
situation, so that we can exit that (eg. release some lock we know we're
holding) in order to process them.
---
 src/backend/access/nbtree/nbtpage.c |  7 +++++++
 src/include/miscadmin.h             | 20 ++++++++------------
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index ebec8fa5b8..00de713035 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -2397,6 +2397,13 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
         {
             bool        leftsibvalid = true;
 
+            if (INTERRUPTS_PENDING_CONDITION())
+            {
+                _bt_relbuf(rel, leafbuf);
+                ProcessInterrupts();
+                return false;   /* should not occur */
+            }
+
             /*
              * Before we follow the link from the page that was the left
              * sibling mere moments ago, validate its right link.  This
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 95202d37af..c5c441c2e9 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -98,23 +98,19 @@ extern PGDLLIMPORT volatile uint32 CritSectionCount;
 extern void ProcessInterrupts(void);
 
 #ifndef WIN32
+#define INTERRUPTS_PENDING_CONDITION() \
+    (unlikely(InterruptPending))
+#else
+#define INTERRUPTS_PENDING_CONDITION() \
+    (unlikely(UNBLOCKED_SIGNAL_QUEUE()) ? pgwin32_dispatch_queued_signals() : 0, \
+     unlikely(InterruptPending))
+#endif
 
 #define CHECK_FOR_INTERRUPTS() \
 do { \
-    if (unlikely(InterruptPending)) \
+    if (INTERRUPTS_PENDING_CONDITION()) \
         ProcessInterrupts(); \
 } while(0)
-#else                           /* WIN32 */
-
-#define CHECK_FOR_INTERRUPTS() \
-do { \
-    if (unlikely(UNBLOCKED_SIGNAL_QUEUE())) \
-        pgwin32_dispatch_queued_signals(); \
-    if (unlikely(InterruptPending)) \
-        ProcessInterrupts(); \
-} while(0)
-#endif                          /* WIN32 */
-
 
 #define HOLD_INTERRUPTS()  (InterruptHoldoffCount++)
 
-- 
2.20.1
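
For anyone who wants to see the effect outside the backend, here is a rough standalone sketch of why the split helps. It is a toy model, not PostgreSQL code: the globals, lock_buffer()/unlock_buffer(), loop_plain() and loop_split() are all hypothetical stand-ins, and this ProcessInterrupts() only counts calls where the real one would throw an error on query cancel. The one behavior it borrows from the real thing is that ProcessInterrupts() does nothing while InterruptHoldoffCount > 0, which is the situation while a buffer lock is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backend state; not the real definitions. */
static volatile bool InterruptPending = false;
static volatile unsigned InterruptHoldoffCount = 0;
static int  interrupts_processed = 0;
static bool buffer_locked = false;

/* Toy version: returns early while interrupts are held off. */
void
ProcessInterrupts(void)
{
    if (InterruptHoldoffCount != 0)
        return;
    InterruptPending = false;
    interrupts_processed++;
}

#define INTERRUPTS_PENDING_CONDITION() (InterruptPending)

#define CHECK_FOR_INTERRUPTS() \
do { \
    if (INTERRUPTS_PENDING_CONDITION()) \
        ProcessInterrupts(); \
} while (0)

/* Assumption: taking the lock holds off interrupts, releasing resumes them. */
void
lock_buffer(void)
{
    buffer_locked = true;
    InterruptHoldoffCount++;
}

void
unlock_buffer(void)
{
    assert(buffer_locked);
    buffer_locked = false;
    InterruptHoldoffCount--;
}

/* Old pattern: the check is a no-op under the lock, so the loop runs out. */
int
loop_plain(int max_iters)
{
    int         i;

    lock_buffer();
    InterruptPending = true;    /* pretend a cancel request arrived */
    for (i = 0; i < max_iters; i++)
    {
        CHECK_FOR_INTERRUPTS(); /* ignored: InterruptHoldoffCount > 0 */
        if (!InterruptPending)
            break;
    }
    unlock_buffer();
    return i;                   /* iterations spent in the loop */
}

/* New pattern: test the condition, drop the lock, then process. */
int
loop_split(int max_iters)
{
    int         i;

    lock_buffer();
    InterruptPending = true;    /* pretend a cancel request arrived */
    for (i = 0; i < max_iters; i++)
    {
        if (INTERRUPTS_PENDING_CONDITION())
        {
            unlock_buffer();
            ProcessInterrupts();
            return i;
        }
    }
    unlock_buffer();
    return i;
}
```

In this toy, loop_plain(1000) spins through all 1000 iterations without processing anything, while loop_split(1000) bails out on the first iteration with the lock released. The cap on iterations exists only so the sketch terminates; the loop in the corrupted-btree case had no such limit.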