Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Andres Freund
Hi, On 2019-11-04 14:58:20 -0500, Robert Haas wrote: > On Mon, Nov 4, 2019 at 2:04 PM Andres Freund wrote: > > Is that really true? In the case where it started and failed we expect > > the error queue to have been attached to, and there to be either an > > error 'E' or a 'X' response (cf HandleP

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Robert Haas
On Mon, Nov 4, 2019 at 2:04 PM Andres Freund wrote: > Is that really true? In the case where it started and failed we expect > the error queue to have been attached to, and there to be either an > error 'E' or a 'X' response (cf HandleParallelMessage()). It doesn't > strike me as very complicated

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Stephen Frost
Greetings, * Andres Freund (and...@anarazel.de) wrote: > On 2019-10-09 12:29:18 -0400, Robert Haas wrote: > > I would say rather that if fork() is failing on your system, you have > > a not very stable system. > > I don't think that's really true, fwiw. It's often a good idea to turn > on strict

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Andres Freund
Hi, On 2019-11-04 12:14:53 -0500, Robert Haas wrote: > If a process trying to register workers finds out that no worker slots > are available, it discovers this at the time it tries to perform the > registration. But fork() failure happens later and in a different > process. The original process j

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Andres Freund
Hi, On 2019-10-09 12:29:18 -0400, Robert Haas wrote: > I would say rather that if fork() is failing on your system, you have > a not very stable system. I don't think that's really true, fwiw. It's often a good idea to turn on strict memory overcommit accounting, and with that set, it's actually

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Tom Lane
Alvaro Herrera writes: > On 2019-Nov-04, Robert Haas wrote: >> On Mon, Nov 4, 2019 at 10:42 AM Alvaro Herrera >> wrote: >>> I agree with this point in principle. Everything else (queries, >>> checkpointing) can fail, but it's critical that postmaster continues to >>> run [...] >> Sure, I'm not

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Alvaro Herrera
On 2019-Nov-04, Robert Haas wrote: > On Mon, Nov 4, 2019 at 10:42 AM Alvaro Herrera > wrote: > > > True, it's not a situation you especially want to be in. However, > > > I've lost count of the number of times that I've heard someone talk > > > about how their system was overstressed to the poi

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Robert Haas
On Mon, Nov 4, 2019 at 10:42 AM Alvaro Herrera wrote: > > True, it's not a situation you especially want to be in. However, > > I've lost count of the number of times that I've heard someone talk > > about how their system was overstressed to the point that everything > > else was failing, but Po

Re: Missed check for too-many-children in bgworker spawning

2019-11-04 Thread Alvaro Herrera
On 2019-Oct-09, Tom Lane wrote: > Robert Haas writes: > > On Wed, Oct 9, 2019 at 10:21 AM Tom Lane wrote: > >> We could improve on matters so far as the postmaster's child-process > >> arrays are concerned, by defining separate slot "pools" for the different > >> types of child processes. But I

Re: Missed check for too-many-children in bgworker spawning

2019-10-09 Thread Tom Lane
Robert Haas writes: > On Wed, Oct 9, 2019 at 10:21 AM Tom Lane wrote: >> We could improve on matters so far as the postmaster's child-process >> arrays are concerned, by defining separate slot "pools" for the different >> types of child processes. But I don't see much point if the code is >> not

Re: Missed check for too-many-children in bgworker spawning

2019-10-09 Thread Robert Haas
On Wed, Oct 9, 2019 at 10:21 AM Tom Lane wrote: > Well, that means we have a not-very-stable system then. > > We could improve on matters so far as the postmaster's child-process > arrays are concerned, by defining separate slot "pools" for the different > types of child processes. But I don't se

Re: Missed check for too-many-children in bgworker spawning

2019-10-09 Thread Tom Lane
Robert Haas writes: > On Mon, Oct 7, 2019 at 4:03 PM Tom Lane wrote: >> ... Moreover, we have to --- and already do, I trust --- deal with >> other resource-exhaustion errors in exactly the same code path, notably >> fork(2) failure which we simply can't predict or prevent. Doesn't the >> parall

Re: Missed check for too-many-children in bgworker spawning

2019-10-09 Thread Robert Haas
On Mon, Oct 7, 2019 at 4:03 PM Tom Lane wrote: > I'm not following your point? Whatever you might think the appropriate > response is, I'm pretty sure "elog(FATAL) out of the postmaster" is not > it. Moreover, we have to --- and already do, I trust --- deal with > other resource-exhaustion error

Re: Missed check for too-many-children in bgworker spawning

2019-10-07 Thread Tom Lane
Robert Haas writes: > On Sun, Oct 6, 2019 at 1:17 PM Tom Lane wrote: >> The attached proposed patch fixes this by making bgworker spawning >> include a canAcceptConnections() test. > I think it used to work this way -- not sure if it was ever committed > this way, but it at least did during deve

Re: Missed check for too-many-children in bgworker spawning

2019-10-07 Thread Robert Haas
On Sun, Oct 6, 2019 at 1:17 PM Tom Lane wrote: > Over in [1] we have a report of a postmaster shutdown that seems to > have occurred because some client logic was overaggressively spawning > connection requests, causing the postmaster's child-process arrays to > be temporarily full, and then some

Missed check for too-many-children in bgworker spawning

2019-10-06 Thread Tom Lane
Over in [1] we have a report of a postmaster shutdown that seems to have occurred because some client logic was overaggressively spawning connection requests, causing the postmaster's child-process arrays to be temporarily full, and then some parallel query tried to launch a new bgworker process.