On Wed, Jan 24, 2018 at 2:35 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Jan 23, 2018 at 11:15 AM, Robert Haas <robertmh...@gmail.com> wrote:
>> On Mon, Jan 22, 2018 at 10:56 PM, Amit Kapila <amit.kapil...@gmail.com>
>> wrote:
>>
>> OK. I've committed this patch and back-patched it to 9.6.
>> Back-patching to 9.5 didn't look simple because nworkers_launched
>> doesn't exist on that branch, and it doesn't matter very much anyway
>> since the infrastructure has no in-core users in 9.5.
>
> I was just talking to Thomas, and one or the other of us realized that
> this doesn't completely fix the problem.  I think that if a worker
> fails, even by some unorthodox mechanism like an abrupt proc_exit(1),
> after the point where it attached to the error queue, this patch is
> sufficient to make that reliably turn into an error.  But what if it
> fails before that - e.g. fork() failure?  In that case, this process
> will catch the problem if the calling process calls
> WaitForParallelWorkersToFinish, but does that actually happen in any
> interesting cases?  Maybe not.
>
> In the case of Gather, for example, we won't
> WaitForParallelWorkersToFinish() until we've read all the tuples.  If
> there's a tuple queue that does not yet have a sender, then we'll just
> wait for it to get one, even if the sender fell victim to a fork
> failure and is never going to show up.
>

Hmm, I think that case will be addressed, because the tuple queues can
detect that the worker never attached.  That happens in the code path
shm_mq_receive() -> shm_mq_counterparty_gone(): shm_mq_counterparty_gone()
can detect that the worker is gone by using GetBackgroundWorkerPid().
Moreover, I had manually tested this particular case before saying your
patch is fine.  Do you have some other case in mind that I am missing?
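For reference, a simplified sketch of that check in
src/backend/storage/ipc/shm_mq.c (paraphrased, not a verbatim quote of
the source):

static bool
shm_mq_counterparty_gone(shm_mq *mq, BackgroundWorkerHandle *handle)
{
    pid_t       pid;

    /* If the queue has already been detached, the counterparty is gone. */
    if (mq->mq_detached)
        return true;

    /* If we have a worker handle, ask whether the worker is still alive. */
    if (handle != NULL)
    {
        BgwHandleStatus status;

        status = GetBackgroundWorkerPid(handle, &pid);
        if (status != BGWH_STARTED && status != BGWH_NOT_YET_STARTED)
        {
            /*
             * Worker is neither running nor still expected to start;
             * treat the queue as detached.
             */
            mq->mq_detached = true;
            return true;
        }
    }

    return false;
}

So, as long as GetBackgroundWorkerPid() reports the failed worker as
stopped, a receiver that was given the worker's BackgroundWorkerHandle
should error out rather than wait forever for a sender that is never
going to attach.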
> We could *almost* fix this by having execParallel.c include a Barrier
> in the DSM, similar to what I proposed for parallel CREATE INDEX.
> When a worker starts, it attaches to the barrier; when it exits, it
> detaches.  Instead of looping until the leader is done and all the
> tuple queues are detached, Gather or Gather Merge could loop until the
> leader is done and everyone detaches from the barrier.  But that would
> require changes to the Barrier API, which isn't prepared to have the
> caller alternate waiting with other activity like reading the tuple
> queues, which would clearly be necessary in this case.  Moreover, it
> doesn't work at all when parallel_leader_participation=off, because in
> that case we'll again just wait forever for some worker to show up,
> and if none of them do, then we're hosed.
>
> One idea to fix this is to add a new function in parallel.c with a
> name like CheckForFailedWorkers() that Gather and Gather Merge call
> just before they WaitLatch().  If a worker fails to start, the
> postmaster will send SIGUSR1 to the leader, which will set the latch,
> so we can count on the function being run after every worker death.
> The function would go through and check each worker for which
> pcxt->worker[i].error_mqh != NULL && !pcxt->any_message_received[i] to
> see whether GetBackgroundWorkerPid(pcxt->worker[i].bgwhandle, &pid) ==
> BGWH_STOPPED.  If so, ERROR.
>
> This lets us *run* for arbitrary periods of time without detecting a
> fork failure, but before we *wait* we will notice.  Getting more
> aggressive than that sounds expensive, but I'm open to ideas.
>
> If we adopt this approach, then Peter's patch could probably make use
> of it, too.  It's a little tricky because ConditionVariableSleep()
> tries not to return spuriously, so we'd either have to change that or
> stick CheckForFailedWorkers() into that function's loop.  I suspect
> the former is a better approach.  Or maybe that patch should still use
> the Barrier approach and avoid all of this annoyance.
>

I think we can discuss your proposed solution once the problem is
clear.  As of now, it is not clear to me whether there is any
remaining problem.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com