On Mon, Oct 7, 2019 at 4:03 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> I'm not following your point?  Whatever you might think the appropriate
> response is, I'm pretty sure "elog(FATAL) out of the postmaster" is not
> it.  Moreover, we have to --- and already do, I trust --- deal with
> other resource-exhaustion errors in exactly the same code path, notably
> fork(2) failure which we simply can't predict or prevent.  Doesn't the
> parallel query logic already deal sanely with failure to obtain as many
> workers as it wanted?
If we fail to obtain workers because there aren't enough worker slots
available, parallel query deals with that smoothly.  But once we have a
slot, any further failure will cause the parallel query to ERROR out.

For the case where we get a slot but can't start the worker process,
see WaitForParallelWorkersToFinish and/or WaitForParallelWorkersToAttach
and the comments therein.  Once we're attached, any error thrown by the
worker is propagated back to the master; see HandleParallelMessages and
pq_redirect_to_shm_mq.

Now, you could argue that the master ought to selectively ignore
certain kinds of errors and just continue on, while rethrowing others,
say based on the errcode().  Such design ideas have been roundly panned
in other contexts, though, so I'm not sure it would be a great idea to
do it here either.  But in any case, it's not how the current system
behaves, or was designed to behave.
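
To make the first point concrete, here is a minimal sketch of the shape
of the leader-side pattern, loosely following the README.parallel
sequence in parallel.c / execParallel.c; the shm_toc setup is elided
and nworkers is just illustrative, so don't read it as verbatim tree
code:

    int              nworkers = 4;   /* illustrative */
    ParallelContext *pcxt;

    EnterParallelMode();
    pcxt = CreateParallelContext("postgres", "ParallelQueryMain",
                                 nworkers);
    /* ... shm_toc_estimate / shm_toc_insert setup elided ... */
    InitializeParallelDSM(pcxt);

    /*
     * LaunchParallelWorkers() may launch fewer workers than we asked
     * for if worker slots are scarce; the query just proceeds with
     * however many it got, and if nworkers_launched is zero the
     * leader runs the whole plan itself.
     */
    LaunchParallelWorkers(pcxt);
    if (pcxt->nworkers_launched == 0)
    {
        /* no slots free: fall back to executing in the leader alone */
    }

    /*
     * Past this point, a worker that got a slot but then failed to
     * start, or died mid-query, surfaces as an ERROR here.
     */
    WaitForParallelWorkersToFinish(pcxt);
    DestroyParallelContext(pcxt);
    ExitParallelMode();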
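
And just to make the idea I'm panning concrete, errcode()-based
selective rethrow in the leader would presumably look something like
the following (purely hypothetical; nothing like this exists in the
tree, and the choice of ERRCODE_INSUFFICIENT_RESOURCES as the
"ignorable" class is my invention for the example):

    MemoryContext oldcontext = CurrentMemoryContext;

    PG_TRY();
    {
        WaitForParallelWorkersToAttach(pcxt);
    }
    PG_CATCH();
    {
        ErrorData  *edata;

        /* leave ErrorContext before copying the error data */
        MemoryContextSwitchTo(oldcontext);
        edata = CopyErrorData();
        FlushErrorState();

        /* swallow resource exhaustion, rethrow everything else */
        if (edata->sqlerrcode == ERRCODE_INSUFFICIENT_RESOURCES)
            FreeErrorData(edata);   /* pretend we never wanted it */
        else
            ReThrowError(edata);
    }
    PG_END_TRY();

Even if you got that bookkeeping right, deciding which errcodes are
"safe" to swallow is exactly the judgment call that's been panned
before.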

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company