On Thu, Dec 12, 2024 at 11:36 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Thu, Dec 12, 2024 at 9:43 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
> > Our theory is that commit 7389aad (and follow-ups like commit 239b175) made
> > parallel worker processing much more responsive to the point of contending
> > with incoming connections, and that before this change, the kernel balanced
> > the execution of the signal handlers and ServerLoop() to prevent this.  I
> > don't have a concrete proposal yet, but I thought it was still worth
> > starting a discussion.  TBH I'm not sure we really need to do anything
> > since this arguably comes down to a trade-off between connection and worker
> > responsiveness.
>
> One factor is:
>
>          * Check if the latch is set already. If so, leave the loop
>          * immediately, avoid blocking again. We don't attempt to report any
>          * other events that might also be satisfied.

Here's an experimental patch that tries changing that policy.  It
improves the connection times with your test on my small computer,
though I doubt I'm seeing the real issue.  In theory, though, assuming a
backlog of connections and workers to start, each server loop iteration
should be able to accept and fork one client backend, and launch up to
100 (MAX_BGWORKERS_TO_LAUNCH) background workers.
From 0444997d07342d733320743213cec736b0d008d0 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Thu, 12 Dec 2024 23:46:50 +1300
Subject: [PATCH v1] Latches shouldn't starve socket event delivery.

If a WaitEventSetWait() caller wants multiple events, an already-set
latch shouldn't prevent other events from being reported at the same
time.  It should certainly prevent sleeping, but that just requires
using a zero timeout when polling the kernel.

The main caller of WaitEventSetWait() with nevents > 1 is the
postmaster.  Even if its latch is frequently set, we don't want socket
accept processing to be starved.

XXX experimental

Reported-by: Nathan Bossart <nathandboss...@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLOcxUa6m7UinPN1gZXFyr92L8btG_pGTHPiWY2YbRw2w%40mail.gmail.com
---
 src/backend/storage/ipc/latch.c | 35 ++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index f19ce5e7aff..67ace0964d7 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1452,9 +1452,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 		int			rc;
 
 		/*
-		 * Check if the latch is set already. If so, leave the loop
-		 * immediately, avoid blocking again. We don't attempt to report any
-		 * other events that might also be satisfied.
+		 * First check whether the latch is already set.  If so, we either exit
+		 * immediately or ask the kernel for any more events available right
+		 * now without waiting, depending on how many events the caller wants.
 		 *
 		 * If someone sets the latch between this and the
 		 * WaitEventSetWaitBlock() below, the setter will write a byte to the
@@ -1499,7 +1499,16 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 			/* could have been set above */
 			set->latch->maybe_sleeping = false;
 
-			break;
+			if (returned_events == nevents)
+				break;	/* output buffer full already */
+
+			/*
+			 * Even though we already have an event, we'll poll just once with
+			 * zero timeout to see what non-latch events we can fit into the
+			 * output buffer at the same time.
+			 */
+			cur_timeout = 0;
+			timeout = 0;
 		}
 
 		/*
@@ -1508,18 +1517,16 @@ WaitEventSetWait(WaitEventSet *set, long timeout,
 		 * to retry, everything >= 1 is the number of returned events.
 		 */
 		rc = WaitEventSetWaitBlock(set, cur_timeout,
-								   occurred_events, nevents);
+								   occurred_events, nevents - returned_events);
 
-		if (set->latch)
-		{
-			Assert(set->latch->maybe_sleeping);
+		if (set->latch &&
+			set->latch->maybe_sleeping)
 			set->latch->maybe_sleeping = false;
-		}
 
 		if (rc == -1)
 			break;				/* timeout occurred */
 		else
-			returned_events = rc;
+			returned_events += rc;
 
 		/* If we're not done, update cur_timeout for next iteration */
 		if (returned_events == 0 && timeout >= 0)
@@ -1607,7 +1614,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			/* Drain the signalfd. */
 			drain();
 
-			if (set->latch && set->latch->is_set)
+			if (set->latch && set->latch->maybe_sleeping && set->latch->is_set)
 			{
 				occurred_events->fd = PGINVALID_SOCKET;
 				occurred_events->events = WL_LATCH_SET;
@@ -1766,7 +1773,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 		if (cur_event->events == WL_LATCH_SET &&
 			cur_kqueue_event->filter == EVFILT_SIGNAL)
 		{
-			if (set->latch && set->latch->is_set)
+			if (set->latch && set->latch->maybe_sleeping && set->latch->is_set)
 			{
 				occurred_events->fd = PGINVALID_SOCKET;
 				occurred_events->events = WL_LATCH_SET;
@@ -1891,7 +1898,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			/* There's data in the self-pipe, clear it. */
 			drain();
 
-			if (set->latch && set->latch->is_set)
+			if (set->latch && set->latch->maybe_sleeping && set->latch->is_set)
 			{
 				occurred_events->fd = PGINVALID_SOCKET;
 				occurred_events->events = WL_LATCH_SET;
@@ -2107,7 +2114,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			if (!ResetEvent(set->handles[cur_event->pos + 1]))
 				elog(ERROR, "ResetEvent failed: error code %lu", GetLastError());
 
-			if (set->latch && set->latch->is_set)
+			if (set->latch && set->latch->maybe_sleeping && set->latch->is_set)
 			{
 				occurred_events->fd = PGINVALID_SOCKET;
 				occurred_events->events = WL_LATCH_SET;
-- 
2.39.5
