On Mon, Nov 4, 2019 at 10:42 AM Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:
> > True, it's not a situation you especially want to be in. However,
> > I've lost count of the number of times that I've heard someone talk
> > about how their system was overstressed to the point that everything
> > else was failing, but Postgres kept chugging along. That's a good
> > reputation to have and we shouldn't just walk away from it.
>
> I agree with this point in principle. Everything else (queries,
> checkpointing) can fail, but it's critical that postmaster continues to
> run -- that way, once the high load episode is over, connections can be
> re-established as needed, auxiliary processes can be re-launched, and
> the system can be again working normally. If postmaster dies, all bets
> are off. Also: an idle postmaster is not using any resources; on its
> own, killing it or it dying would not free any useful resources for the
> system load to be back to low again.
Sure, I'm not arguing that the postmaster should blow up and die. I was, however, arguing that if the postmaster fails to launch workers for a parallel query due to process table exhaustion, it's OK for *that query* to error out. Tom finds that argument to be "utter bunkum," but I don't agree.

I think there might also be some implementation complexity there that is more than meets the eye. If a process trying to register workers finds out that no worker slots are available, it discovers this at the time it tries to perform the registration. But fork() failure happens later and in a different process. The original process just finds out that the worker is "stopped," not whether or not it ever got started in the first place. We certainly can't ignore a worker that managed to start and then bombed out, because it might've already, for example, claimed a block from a Parallel Seq Scan and not yet sent back the corresponding tuples. We could ignore a worker that never started at all, due to EAGAIN or whatever else, but the original process that registered the worker has no way of finding this out. (There's a sketch of what the leader-side code sees at the end of this mail.)

Now you might think we could just fix that by having the postmaster record something in the slot, but that doesn't work either, because the slot could get reused before the original process checks the status information. The fact that the slot has been reused is sufficient evidence that the worker was unregistered, which means it either stopped or we gave up on starting it, but it doesn't tell us which one. To be able to tell that, we'd have to have a mechanism to prevent slots from getting reused until any necessary exit status information had been read, sort of like the OS-level zombie process mechanism (which we all love, I guess, and therefore definitely want to reinvent...?).

The postmaster logic would need to be made more complicated so that zombies couldn't accumulate: if a process asked for status notifications but then died, any zombies waiting for it would need to be cleared. You'd also have to make sure that a process which didn't die was guaranteed to read the status from the zombie to clear it, and that it did so in a reasonably timely fashion, which is currently in no way guaranteed and does not appear at all straightforward to guarantee. And even if you solved all of that, I think you might still find that it breaks some parallel query (or parallel CREATE INDEX) code that expects the number of workers to change at registration time, but not afterwards. So that code would all need to be adjusted.

In short, I think Tom wants a pony. But that does not mean we should not fix this bug.
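To make that concrete, here's a rough, untested sketch of the leader's view under the existing dynamic bgworker API; the extension and function names are made up, and the point is just where each failure mode becomes visible to the registering backend:

#include "postgres.h"

#include "miscadmin.h"
#include "postmaster/bgworker.h"

/* Illustration only -- "my_extension"/"my_worker_main" don't exist. */
static void
launch_one_worker(void)
{
    BackgroundWorker worker;
    BackgroundWorkerHandle *handle;
    BgwHandleStatus status;
    pid_t       pid;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION;
    worker.bgw_start_time = BgWorkerStart_ConsistentState;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "my_extension");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "my_worker_main");
    snprintf(worker.bgw_name, BGW_MAXLEN, "illustrative worker");
    worker.bgw_notify_pid = MyProcPid;

    /*
     * Failure #1: no free worker slots.  The registering backend finds
     * out right here, synchronously, and can choose to proceed with
     * fewer workers (or error out).
     */
    if (!RegisterDynamicBackgroundWorker(&worker, &handle))
        return;

    /*
     * Failure #2: the postmaster's fork() fails later, with EAGAIN or
     * whatever.  All the handle can tell us is BGWH_STOPPED, which is
     * the same thing we'd see for a worker that started, did some work,
     * and then died -- we can't distinguish the two cases from here.
     */
    status = WaitForBackgroundWorkerStartup(handle, &pid);
    if (status == BGWH_STOPPED)
    {
        /* never started, or started and exited?  no way to know */
    }
}

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company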