On Tue, Mar 14, 2023 at 11:20 AM Andres Freund <and...@anarazel.de> wrote: > On windows it looks like pids can't be reused as long as there are handles for > the process. Unfortunately, we close the handle for the process in > pgwin32_deadchild_callback(), which runs in a separate thread, so the pid can > be reused before we get to waitpid(). And thus it can happen while we start > new children.
Ahhh. Right, of course. The handle thing makes total sense now that you point it out, and although I couldn't find it in the fine manual, a higher authority has it in black and white[1]. Even without knowing which of those calls is releasing the process table entry, we're doing all of them on the wrong side of that IOCP. Alright, here is a patch to schlep most of that code over into waitpid(), where it belongs. [1] https://devblogs.microsoft.com/oldnewthing/20110107-00/?p=11803
From cd83520c7d9ca214c98b9c0d4c74c9371f5555b0 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Tue, 14 Mar 2023 11:53:26 +1300 Subject: [PATCH] Fix waitpid() emulation on Windows. Unix doesn't allow a PID to be recycled for a new process until after the earlier process is "reaped" by waitpid() (or similar functions). Our emulation didn't guarantee that, and the postmaster could finish up tracking multiple children with the same PID, and get very confused. Fix, by moving the guts of pgwin32_deadchild_callback() into waitpid(), so that resources are released synchronously. The process and PID continue to exist until we close the process handle, when we're ready to adjust our own bookkeeping of running children. This seems to explain a couple of failures on CI. It had never been reported before, despite the code being as old as the Windows port. Perhaps Windows started recycling PIDs more rapidly, or perhaps timing changes due to commit 7389aad6 made it more likely to break. Diagnosed-by: Andres Freund <and...@anarazel.de> Discussion: https://postgr.es/m/20230208012852.bvkn2am4h4iqjogq%40awork3.anarazel.de --- src/backend/postmaster/postmaster.c | 65 ++++++++++++++++------------- 1 file changed, 36 insertions(+), 29 deletions(-) diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 2552327d90..3da18ab6ac 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -6421,36 +6421,24 @@ ShmemBackendArrayRemove(Backend *bn) static pid_t waitpid(pid_t pid, int *exitstatus, int options) { + win32_deadchild_waitinfo *childinfo; + DWORD exitcode; DWORD dwd; ULONG_PTR key; OVERLAPPED *ovl; - /* - * Check if there are any dead children. If there are, return the pid of - * the first one that died. - */ - if (GetQueuedCompletionStatus(win32ChildQueue, &dwd, &key, &ovl, 0)) + Assert(pid == -1); + Assert(options == WNOHANG); + + /* Try to consume one win32_deadchild_waitinfo from the queue. */ + if (!GetQueuedCompletionStatus(win32ChildQueue, &dwd, &key, &ovl, 0)) { - *exitstatus = (int) key; - return dwd; + errno = EAGAIN; + return -1; } - return -1; -} - -/* - * Note! Code below executes on a thread pool! All operations must - * be thread safe! Note that elog() and friends must *not* be used. - */ -static void WINAPI -pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired) -{ - win32_deadchild_waitinfo *childinfo = (win32_deadchild_waitinfo *) lpParameter; - DWORD exitcode; - - if (TimerOrWaitFired) - return; /* timeout. Should never happen, since we use - * INFINITE as timeout value. */ + childinfo = (win32_deadchild_waitinfo *) key; + pid = childinfo->procId; /* * Remove handle from wait - required even though it's set to wait only @@ -6466,13 +6454,11 @@ pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired) write_stderr("could not read exit code for process\n"); exitcode = 255; } - - if (!PostQueuedCompletionStatus(win32ChildQueue, childinfo->procId, (ULONG_PTR) exitcode, NULL)) - write_stderr("could not post child completion status\n"); + *exitstatus = exitcode; /* - * Handle is per-process, so we close it here instead of in the - * originating thread + * Close the process handle. Only after this point can the PID can be + * recycled by the kernel. */ CloseHandle(childinfo->procHandle); @@ -6482,7 +6468,28 @@ pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired) */ free(childinfo); - /* Queue SIGCHLD signal */ + return pid; +} + +/* + * Note! Code below executes on a thread pool! All operations must + * be thread safe! Note that elog() and friends must *not* be used. + */ +static void WINAPI +pgwin32_deadchild_callback(PVOID lpParameter, BOOLEAN TimerOrWaitFired) +{ + /* Should never happen, since we use INFINITE as timeout value. */ + if (!TimerOrWaitFired) + return; + + /* Post the win32_deadchild_waitinfo object for waitpid() to deal with. */ + if (!PostQueuedCompletionStatus(win32ChildQueue, + 0, + (ULONG_PTR) lpParameter, + NULL)) + write_stderr("could not post child completion status\n"); + + /* Queue SIGCHLD signal. */ pg_queue_signal(SIGCHLD); } #endif /* WIN32 */ -- 2.39.1