I wrote:
> I've not tried to trace the code, but I'm now a bit suspicious
> that there is indeed a design bug here.  I gather from the
> comments that parallel_register_count is incremented by the
> worker processes, which of course implies that a worker that
> fails to reattach to shared memory won't do that.  But
> parallel_terminate_count is incremented by the postmaster.
> If the postmaster will do that even in the case of a worker that
> failed at startup, then lorikeet's symptoms are neatly explained.
That theory seems to be nonsense.  After a bit more study of the code,
I see that parallel_register_count is incremented by the *leader*
process, when it reserves a BackgroundWorkerSlot for the worker.
And parallel_terminate_count is incremented by the postmaster when it
releases the slot; so it's darn hard to see how
parallel_terminate_count could get ahead of parallel_register_count.

I noticed that lorikeet's worker didn't fail at shared memory reattach,
as I first thought, anyway.  It failed at

ERROR: could not map dynamic shared memory segment

which means we ought to be able to reproduce the symptoms by faking
failure of dsm_attach(), as I did in the quick hack attached.  What I
get is a lot of "parallel worker failed to initialize" and "lost
connection to parallel worker" errors, but no assertion.  (I also tried
this with an EXEC_BACKEND build, just in case that'd change the
behavior, but it didn't.)  So it seems like the "lorikeet is flaky"
theory is looking pretty plausible.

I do see what seems to be a bug-let in ForgetBackgroundWorker.
BackgroundWorkerStateChange is careful to do this when freeing a slot:

		/*
		 * We need a memory barrier here to make sure that the load of
		 * bgw_notify_pid and the update of parallel_terminate_count
		 * complete before the store to in_use.
		 */
		notify_pid = slot->worker.bgw_notify_pid;
		if ((slot->worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
			BackgroundWorkerData->parallel_terminate_count++;
		pg_memory_barrier();
		slot->pid = 0;
		slot->in_use = false;

but the mainline case in ForgetBackgroundWorker is a lot less paranoid:

	Assert(rw->rw_shmem_slot < max_worker_processes);
	slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
	if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
		BackgroundWorkerData->parallel_terminate_count++;
	slot->in_use = false;

One of these functions is mistaken.  However, I can't construct a
theory whereby that explains lorikeet's symptoms, mainly because Intel
chips don't do out-of-order stores, so the messing with
parallel_terminate_count should be done before in_use is cleared, even
without an explicit memory barrier.

			regards, tom lane
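PS: if BackgroundWorkerStateChange has the ordering right, then
presumably ForgetBackgroundWorker needs the same barrier discipline.
An untested sketch of what that might look like (the comment wording
here is new, not taken from the tree):

	Assert(rw->rw_shmem_slot < max_worker_processes);
	slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];

	/*
	 * As in BackgroundWorkerStateChange: make sure the update of
	 * parallel_terminate_count completes before the slot is marked
	 * not-in-use.
	 */
	if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
		BackgroundWorkerData->parallel_terminate_count++;
	pg_memory_barrier();
	slot->in_use = false;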
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 3550ef13ba..d3b00c2f9e 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1278,6 +1278,11 @@ ParallelWorkerMain(Datum main_arg)
 										 "Parallel worker",
 										 ALLOCSET_DEFAULT_SIZES);
 
+	if (random() < INT_MAX / 64)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("fake failure to map dynamic shared memory segment")));
+
 	/*
 	 * Attach to the dynamic shared memory segment for the parallel query, and
 	 * find its table of contents.
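For completeness, a quick way to sanity-check the hack's failure rate
outside the server (this little program is not part of the patch):
random() is uniform over [0, INT_MAX], so the injected failure should
hit roughly 1 worker in 64.

	#include <limits.h>
	#include <stdio.h>
	#include <stdlib.h>

	int
	main(void)
	{
		long		hits = 0;
		const long	trials = 1000000;

		srandom(42);			/* fixed seed, just for repeatability */
		for (long i = 0; i < trials; i++)
		{
			if (random() < INT_MAX / 64)
				hits++;
		}
		printf("observed rate %.4f, expected ~%.4f\n",
			   (double) hits / trials, 1.0 / 64.0);
		return 0;
	}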