On Thu, Oct 15, 2020 at 8:40 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.mu...@gmail.com> writes:
> > The process exit event is like an 'edge', not a 'level'... hmm.  It
> > might be enough to set report_postmaster_not_running = true the first
> > time it tells us so if we try to wait again we'll treat it like a
> > level.  I will look into it later today.
>
> Seems like having that be per-WaitEventSet state is also not a great
> idea --- if we detect PM death while waiting on one WES, and then
> wait on another one, it won't work.  A plain process-wide static
> variable would be a better way I bet.

I don't think that's a problem -- the kernel will report the event to each interested kqueue object. The attached fixes the problem for me.
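
To illustrate the edge-versus-level distinction outside of latch.c, here is a toy sketch in plain C. It does not use kqueue at all; ToyWaitSet, toy_deliver_exit(), toy_wait_edge() and toy_wait_level() are hypothetical names invented for this example, and kernel_pending merely stands in for a one-shot kernel notification. It assumes, as described above, that each wait set receives its own copy of the event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model only: none of these names exist in PostgreSQL.  kernel_pending
 * stands in for a one-shot kernel notification (such as kqueue's NOTE_EXIT),
 * which is consumed by the first read.
 */
typedef struct ToyWaitSet
{
	bool		kernel_pending;		/* one-shot event not yet consumed */
	bool		report_not_running;	/* per-set latch, as in the patch */
} ToyWaitSet;

/* Simulate the kernel queuing the exit event on this set. */
void
toy_deliver_exit(ToyWaitSet *set)
{
	assert(set != NULL);
	set->kernel_pending = true;
}

/* Edge-triggered wait: reports the exit once, then never again. */
bool
toy_wait_edge(ToyWaitSet *set)
{
	bool		fired;

	assert(set != NULL);
	fired = set->kernel_pending;
	set->kernel_pending = false;
	return fired;
}

/* Level-triggered wait: remembers the exit so later calls still report it. */
bool
toy_wait_level(ToyWaitSet *set)
{
	assert(set != NULL);
	if (set->report_not_running)
		return true;
	if (toy_wait_edge(set))
	{
		set->report_not_running = true;
		return true;
	}
	return false;
}
```

With a zero-initialized set and one toy_deliver_exit() call, toy_wait_edge() returns true once and false afterwards, while toy_wait_level() keeps returning true on every call. Because each set gets its own notification in this model, a per-set flag is enough to provide the level behaviour.
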

From 50e4c632385951cd37af746bc8b89d893181819f Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Wed, 14 Oct 2020 20:19:04 +1300
Subject: [PATCH] Make WL_POSTMASTER_DEATH level-triggered on kqueue builds.

If WaitEventSetWait() reports that the postmaster has gone away, later
calls to WaitEventSetWait() should continue to report that.  Otherwise
further waits that occur in the proc_exit() path after we already
noticed the postmaster's demise could block forever.

Back-patch to 13, where the kqueue support landed.

Reported-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3624029.1602701929%40sss.pgh.pa.us
---
 src/backend/storage/ipc/latch.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 63c6c97536..663f7ba7eb 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1492,7 +1492,10 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 		timeout_p = &timeout;
 	}
 
-	/* Report events discovered by WaitEventAdjustKqueue(). */
+	/*
+	 * Report postmaster events discovered by WaitEventAdjustKqueue() or
+	 * earlier calls to WaitEventSetWait().
+	 */
 	if (unlikely(set->report_postmaster_not_running))
 	{
 		if (set->exit_on_postmaster_death)
@@ -1563,6 +1566,13 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 				 cur_kqueue_event->filter == EVFILT_PROC &&
 				 (cur_kqueue_event->fflags & NOTE_EXIT) != 0)
 		{
+			/*
+			 * The kernel will tell this kqueue object only once about the exit
+			 * of the postmaster, so let's remember that for next time so that
+			 * we provide level-triggered semantics.
+			 */
+			set->report_postmaster_not_running = true;
+
 			if (set->exit_on_postmaster_death)
 				proc_exit(1);
 			occurred_events->fd = PGINVALID_SOCKET;
-- 
2.27.0