On Mon, Oct 7, 2019 at 4:03 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> I'm not following your point?  Whatever you might think the appropriate
> response is, I'm pretty sure "elog(FATAL) out of the postmaster" is not
> it.  Moreover, we have to --- and already do, I trust --- deal with
> other resource-exhaustion errors in exactly the same code path, notably
> fork(2) failure which we simply can't predict or prevent.  Doesn't the
> parallel query logic already deal sanely with failure to obtain as many
> workers as it wanted?
If we fail to obtain workers because there aren't enough worker slots
available, parallel query deals with that smoothly.  But once we have a
slot, any further failure will cause the parallel query to ERROR out.

For the case where we get a slot but can't start the worker process,
see WaitForParallelWorkersToFinish and/or WaitForParallelWorkersToAttach
and the comments therein.  Once we're attached, any error thrown by the
worker is propagated back to the master; see HandleParallelMessages and
pq_redirect_to_shm_mq.

Now, you could argue that the master ought to selectively ignore
certain kinds of errors and just continue on, while rethrowing others,
say based on the errcode().  Such design ideas have been roundly panned
in other contexts, though, so I'm not sure it would be a great idea to
do it here either.  But in any case, it's not how the current system
behaves, or was designed to behave.
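
To make the first point concrete, here is a minimal sketch of the shape
of the leader-side pattern, loosely following the README.parallel
sequence in parallel.c / execParallel.c; the shm_toc setup is elided
and nworkers is just illustrative, so don't read it as verbatim tree
code:

    int              nworkers = 4;   /* illustrative */
    ParallelContext *pcxt;

    EnterParallelMode();
    pcxt = CreateParallelContext("postgres", "ParallelQueryMain",
                                 nworkers);
    /* ... shm_toc_estimate / shm_toc_insert setup elided ... */
    InitializeParallelDSM(pcxt);

    /*
     * LaunchParallelWorkers() may launch fewer workers than we asked
     * for if worker slots are scarce; the query just proceeds with
     * however many it got, and if nworkers_launched is zero the
     * leader runs the whole plan itself.
     */
    LaunchParallelWorkers(pcxt);
    if (pcxt->nworkers_launched == 0)
    {
        /* no slots free: fall back to executing in the leader alone */
    }

    /*
     * Past this point, a worker that got a slot but then failed to
     * start, or died mid-query, surfaces as an ERROR here.
     */
    WaitForParallelWorkersToFinish(pcxt);
    DestroyParallelContext(pcxt);
    ExitParallelMode();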
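
And just to make the idea I'm panning concrete, errcode()-based
selective rethrow in the leader would presumably look something like
the following (purely hypothetical; nothing like this exists in the
tree, and the choice of ERRCODE_INSUFFICIENT_RESOURCES as the
"ignorable" class is my invention for the example):

    MemoryContext oldcontext = CurrentMemoryContext;

    PG_TRY();
    {
        WaitForParallelWorkersToAttach(pcxt);
    }
    PG_CATCH();
    {
        ErrorData  *edata;

        /* leave ErrorContext before copying the error data */
        MemoryContextSwitchTo(oldcontext);
        edata = CopyErrorData();
        FlushErrorState();

        /* swallow resource exhaustion, rethrow everything else */
        if (edata->sqlerrcode == ERRCODE_INSUFFICIENT_RESOURCES)
            FreeErrorData(edata);   /* pretend we never wanted it */
        else
            ReThrowError(edata);
    }
    PG_END_TRY();

Even if you got that bookkeeping right, deciding which errcodes are
"safe" to swallow is exactly the judgment call that's been panned
before.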

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company