Hi, On 2023-04-01 13:42:21 +1300, Thomas Munro wrote: > Commit 7389aad6 started using WaitEventSetWait() to wait for incoming > connections. Before that, we used select(), for which we have our own > implementation for Windows. > > While hacking on patches to rip a bunch of unused Win32 socket wrapper > code out, I twigged that the old pgwin32_select() code was careful to > report multiple sockets at once by brute force polling of all of them, > while WaitEventSetWait() doesn't do that: it reports just one event, > because that's what the Windows WaitForMultipleEvents() syscall does. > I guess this means you can probably fill up the listen queue of server > socket 1 to prevent server socket 2 from ever being serviced, whereas > on Unix we'll accept one connection at a time from each in round-robin > fashion.
It does indeed sound like it'd behave that way: > If bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 indicates the > lpHandles array index of the object that satisfied the wait. If more than > one object became signaled during the call, this is the array index of the > signaled object with the smallest index value of all the signaled objects. I wonder if we ought to bite the bullet and replace the use of WaitForMultipleObjects() with RegisterWaitForSingleObject() and then use GetQueuedCompletionStatus() to wait. The fairness issue here is a motivation, but the bigger one is that that'd get us out from under the MAXIMUM_WAIT_OBJECTS (IIRC 64) limit. Afaict that'd also allow us to read multiple notifications at once, using GetQueuedCompletionStatusEx(). Medium term that'd also be a small step towards using readiness based APIs in a few places... > I think we could get the same effect as pgwin32_select() more cheaply > by doing the initial WaitForMultipleEvents() call with the caller's > timeout exactly as we do today, and then, while we have space, > repeatedly calling > WaitForMultipleEvents(handles=&events[last_signaled_event_index + 1], > timeout=0) until it reports timeout. That would make sense, and should indeed be reasonable cost-wise. > I mention this now because I'm not sure whether to consider this an > 'open item' for 16, or merely an enhancement for 17. I guess the > former, because someone might call that a new denial of service > vector. On the other hand, if you fill up the listen queue for socket > 1 with enough vigour, you're also denying service to socket 1, so I > don't know if it's worth worrying about. Opinions on that? I'm not sure either. It doesn't strike me as a particularly relevant bottleneck. And the old approach of doing more work for every single connection also made many connections worse, I think? > diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c > index f4123e7de7..cc7b572008 100644 > --- a/src/backend/storage/ipc/latch.c > +++ b/src/backend/storage/ipc/latch.c > @@ -2025,6 +2025,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int > cur_timeout, > */ > cur_event = (WaitEvent *) &set->events[rc - WAIT_OBJECT_0 - 1]; > > +loop: > + As far as I can tell, we'll never see WL_LATCH_SET or WL_POSTMASTER_DEATH. I think it'd look cleaner to move the body of if (cur_event->events & WL_SOCKET_MASK) into a separate function that we then also can call further down. > occurred_events->pos = cur_event->pos; > occurred_events->user_data = cur_event->user_data; > occurred_events->events = 0; > @@ -2044,6 +2046,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int > cur_timeout, > occurred_events->events = WL_LATCH_SET; > occurred_events++; > returned_events++; > + nevents--; > } > } > else if (cur_event->events == WL_POSTMASTER_DEATH) > @@ -2063,6 +2066,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int > cur_timeout, > occurred_events->events = WL_POSTMASTER_DEATH; > occurred_events++; > returned_events++; > + nevents--; > } > } > else if (cur_event->events & WL_SOCKET_MASK) > @@ -2124,6 +2128,36 @@ WaitEventSetWaitBlock(WaitEventSet *set, int > cur_timeout, > { > occurred_events++; > returned_events++; > + nevents--; > + } > + } Seems like we could use returned_events to get nevents in the way you want it, without adding even more ++/-- to each of the different events? Greetings, Andres Freund