On 2021-May-13, Tom Lane wrote:

> BTW, another nasty thing I discovered while testing this is that
> the CHECK_FOR_INTERRUPTS() at line 2146 is useless, because
> we're holding a buffer lock there so InterruptHoldoffCount > 0.
> So once you get into this loop you can't even cancel the query.
> Seems like that needs a fix, too.

This comment made me remember a patch I've had for a while, which splits
the CHECK_FOR_INTERRUPTS() definition in two -- one of them is
INTERRUPTS_PENDING_CONDITION(), which lets us test the condition
separately; that allows the lock we hold to be released prior to
actually processing the interrupts.
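
As a rough illustration of that idea, here is a minimal standalone sketch.
It is not PostgreSQL code: every name in it (interrupt_pending,
holdoff_count, the sketch_* functions and the SKETCH_INTERRUPTS_PENDING
macro) is a hypothetical stand-in, and unlike the real ProcessInterrupts()
the processing function here simply returns instead of raising an error.
It only shows the ordering: test the condition while locked, release the
lock, then process.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the backend's globals; not PostgreSQL code. */
static volatile bool interrupt_pending = false;
static int	holdoff_count = 0;		/* > 0 while a "lock" is held */
static int	interrupts_processed = 0;

/* Cheap test only: it never processes anything, so it is safe under a lock. */
#define SKETCH_INTERRUPTS_PENDING() (interrupt_pending)

static void sketch_lock(void)   { holdoff_count++; }
static void sketch_unlock(void) { assert(holdoff_count > 0); holdoff_count--; }

/* Processing is a no-op while interrupts are held off. */
static void
sketch_process_interrupts(void)
{
	if (holdoff_count > 0)
		return;
	interrupt_pending = false;
	interrupts_processed++;
}

/* Combined form: test and process in one step. */
static void
sketch_check_for_interrupts(void)
{
	if (SKETCH_INTERRUPTS_PENDING())
		sketch_process_interrupts();
}

/*
 * A loop that runs with the lock held.  Returns the iteration at which it
 * bailed out because of a pending interrupt, or max_iters if none arrived.
 */
static int
sketch_locked_loop(int max_iters, int interrupt_at)
{
	sketch_lock();
	for (int i = 0; i < max_iters; i++)
	{
		if (i == interrupt_at)
			interrupt_pending = true;	/* simulate a cancel request */

		if (SKETCH_INTERRUPTS_PENDING())
		{
			sketch_unlock();				/* release first ... */
			sketch_process_interrupts();	/* ... then process */
			return i;
		}
	}
	sketch_unlock();
	return max_iters;
}
```

Calling sketch_check_for_interrupts() between sketch_lock() and
sketch_unlock() leaves the interrupt pending, which is the situation Tom
describes; sketch_locked_loop() gets out of it because the test and the
processing are separate steps.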

The btree code modified here was found to be an actual problem in production,
when a btree was corrupted in such a way that vacuum got stuck in an
infinite loop.  I don't remember the exact details but I think we saw
vacuum running for a couple of weeks, and had to restart the server in
order to terminate it (since it wouldn't respond to signals).

-- 
Álvaro Herrera       Valdivia, Chile
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."                               (Fotis)
               (http://archives.postgresql.org/pgsql-sql/2006-06/msg00265.php)
From 5a008141f135bef5ba933b1e3b65c457f58ad85a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 13 May 2021 11:41:19 -0400
Subject: [PATCH] Split CHECK_FOR_INTERRUPTS

This allows the condition to be checked even when in an interrupts-held
situation, so that we can exit that (eg. release some lock we know we're
holding) in order to process them.
---
 src/backend/access/nbtree/nbtpage.c |  7 +++++++
 src/include/miscadmin.h             | 20 ++++++++------------
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index ebec8fa5b8..00de713035 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -2397,6 +2397,13 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
 		{
 			bool		leftsibvalid = true;
 
+			if (INTERRUPTS_PENDING_CONDITION())
+			{
+				_bt_relbuf(rel, leafbuf);
+				ProcessInterrupts();
+				return false;	/* should not occur */
+			}
+
 			/*
 			 * Before we follow the link from the page that was the left
 			 * sibling mere moments ago, validate its right link.  This
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 95202d37af..c5c441c2e9 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -98,23 +98,19 @@ extern PGDLLIMPORT volatile uint32 CritSectionCount;
 extern void ProcessInterrupts(void);
 
 #ifndef WIN32
+#define INTERRUPTS_PENDING_CONDITION() \
+	(unlikely(InterruptPending))
+#else
+#define INTERRUPTS_PENDING_CONDITION() \
+	(unlikely(UNBLOCKED_SIGNAL_QUEUE()) ? pgwin32_dispatch_queued_signals() : 0,  \
+	 unlikely(InterruptPending))
+#endif
 
 #define CHECK_FOR_INTERRUPTS() \
 do { \
-	if (unlikely(InterruptPending)) \
+	if (INTERRUPTS_PENDING_CONDITION()) \
 		ProcessInterrupts(); \
 } while(0)
-#else							/* WIN32 */
-
-#define CHECK_FOR_INTERRUPTS() \
-do { \
-	if (unlikely(UNBLOCKED_SIGNAL_QUEUE())) \
-		pgwin32_dispatch_queued_signals(); \
-	if (unlikely(InterruptPending)) \
-		ProcessInterrupts(); \
-} while(0)
-#endif							/* WIN32 */
-
 
 #define HOLD_INTERRUPTS()  (InterruptHoldoffCount++)
 
-- 
2.20.1
