At Tue, 06 Feb 2018 13:34:19 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote in
<20180206.133419.02213593.horiguchi.kyot...@lab.ntt.co.jp>
> At Mon, 5 Feb 2018 15:29:27 +0530, Amit Khandekar <amitdkhan...@gmail.com>
> wrote in <CAJ3gD9eFR8=kxjpybehe34ut9+ryet0vbhgefst79ezx3l9...@mail.gmail.com>
> > On 2 February 2018 at 20:46, Robert Haas <robertmh...@gmail.com> wrote:
> > > On Fri, Feb 2, 2018 at 1:43 AM, Amit Khandekar <amitdkhan...@gmail.com> wrote:
> > >> The query is actually hanging because one of the workers is in a small
> > >> loop where it iterates over the subplans searching for unfinished
> > >> plans, and it never comes out of the loop (it's a bug which I am yet
> > >> to fix). And it does not make sense to keep CHECK_FOR_INTERRUPTS in
> > >> each iteration; it's a small loop that does not pass control to any
> > >> other functions.
> > >
> > > Uh, sounds like we'd better fix that bug.
> >
> > The scenario is this: one of the workers, w1, hasn't yet chosen its
> > first plan, and all the plans are already finished. So w1 has its
> > node->as_whichplan equal to -1, and the below condition in
> > choose_next_subplan_for_worker() never becomes true for this worker:
> >
> >     if (pstate->pa_next_plan == node->as_whichplan)
> >     {
> >         /* We've tried everything! */
> >         pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
> >         LWLockRelease(&pstate->pa_lock);
> >         return false;
> >     }
> >
> > What I think is: we should remember which plan we started the search
> > with, before the loop starts. So we save the starting plan value like
> > this, before the loop:
> >
> >     initial_plan = pstate->pa_next_plan;
> >
> > And then break out of the loop when we come back to this initial plan:
> >
> >     if (initial_plan == pstate->pa_next_plan)
> >         break;
> >
> > Now, suppose it so happens that initial_plan is a non-partial plan.
> > Then, when we wrap around to the first partial plan, the check
> > (initial_plan == pstate->pa_next_plan) will never become true, because
> > initial_plan is less than first_partial_plan.
> >
> > So what I have done in the patch is: add a flag 'should_wrap_around'
> > indicating whether we should wrap around. This flag is false when
> > initial_plan is a non-partial plan, in which case we know that we will
> > have covered all plans by the time we reach the last plan. So we break
> > from the loop when we reach the last plan and this flag is false, or
> > when we have reached the initial plan.
> >
> > Attached is a patch that fixes this issue on the above lines.
>
> The patch adds two new variables and always sets them, but wrap-around
> can happen at most once per call, so I think it is enough to arrange to
> stop at wrap-around time. In addition to that, the patch forgets the
> case where there is no partial plan; the loop can overrun the
> pa_finished array in that case.
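To see the saved-start-point idea from the quoted proposal in isolation, here is
a minimal, self-contained C sketch. It is not the actual nodeAppend.c code: the
shared Parallel Append state is reduced to a few globals, all locking is
omitted, and the identifiers (choose_next_subplan, can_wrap_around, and so on)
are illustrative only.

/*
 * Toy model of the "remember where the search started" fix described above.
 * Not PostgreSQL code: shared state is reduced to globals, there is no
 * locking, and all identifiers are illustrative.
 */
#include <stdbool.h>
#include <stdio.h>

#define INVALID_SUBPLAN_INDEX	(-1)

static bool pa_finished[] = {true, true, true, true};	/* all plans done */
static int	nplans = 4;
static int	first_partial_plan = 2;	/* plans 2..3 are partial */

/*
 * Search for an unfinished subplan starting at *pa_next_plan, wrapping back
 * to first_partial_plan only when the search started at a partial plan.
 * Because the loop stops when it returns to its own start point (rather than
 * to the worker's current plan, which may still be -1), it terminates even
 * for a worker that never got to run a plan.
 */
static int
choose_next_subplan(int *pa_next_plan)
{
	int		initial_plan = *pa_next_plan;
	bool	can_wrap_around = (initial_plan >= first_partial_plan);

	if (initial_plan == INVALID_SUBPLAN_INDEX)
		return INVALID_SUBPLAN_INDEX;

	for (;;)
	{
		int		candidate = *pa_next_plan;

		/* Advance the shared cursor for the next caller. */
		if (*pa_next_plan < nplans - 1)
			(*pa_next_plan)++;
		else if (can_wrap_around)
			*pa_next_plan = first_partial_plan;
		else
			*pa_next_plan = INVALID_SUBPLAN_INDEX;	/* nothing left to scan */

		if (!pa_finished[candidate])
			return candidate;

		/* Back where we started, or ran off the end: every plan is done. */
		if (*pa_next_plan == initial_plan ||
			*pa_next_plan == INVALID_SUBPLAN_INDEX)
			return INVALID_SUBPLAN_INDEX;
	}
}

int
main(void)
{
	/* The scenario from the thread: all plans finished, worker has no plan. */
	int		pa_next_plan = 0;
	int		plan = choose_next_subplan(&pa_next_plan);

	if (plan == INVALID_SUBPLAN_INDEX)
		printf("all subplans finished, worker can stop searching\n");
	else
		printf("next subplan to execute: %d\n", plan);
	return 0;
}

With pa_finished all true and the worker's own plan still unset, an exit test
that compares against as_whichplan (-1) can never fire; comparing against the
saved start point lets the search conclude after a single pass.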
Sorry, the patch works fine. But still, the new variables don't seem to be
needed.

> I think something like the following would work.
>
> @@ -485,6 +485,13 @@ choose_next_subplan_for_worker(AppendState *node)
>  		{
>  			/* Loop back to first partial plan. */
>  			pstate->pa_next_plan = append->first_partial_plan;
> +
> +			/*
> +			 * Arrange to bail out if we are trying to fetch the first
> +			 * partial plan.
> +			 */
> +			if (node->as_whichplan < append->first_partial_plan)
> +				node->as_whichplan = append->first_partial_plan;
>  		}
>  		else
>
> > >> But I am not sure about this: while the workers are at it, why does
> > >> the backend that is waiting for the workers not come out of the wait
> > >> state with a SIGINT? I guess the same issue has been discussed in the
> > >> mail thread that you pointed to.
> > >
> > > Is it getting stuck here?
> > >
> > >     /*
> > >      * We can't finish transaction commit or abort until all of the workers
> > >      * have exited.  This means, in particular, that we can't respond to
> > >      * interrupts at this stage.
> > >      */
> > >     HOLD_INTERRUPTS();
> > >     WaitForParallelWorkersToExit(pcxt);
> > >     RESUME_INTERRUPTS();
> >
> > Actually the backend is getting stuck in
> > choose_next_subplan_for_leader(), in LWLockAcquire(), waiting for the
> > hanging worker to release pstate->pa_lock. I think there is nothing
> > wrong with this in itself, because an LWLock wait is assumed to be very
> > short; it is only because of this bug that the LWLock is waited on
> > forever.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
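For completeness, the effect of the one-line clamp in the quoted hunk can be
modelled in the same simplified setting as the earlier sketch. The code below
is again illustrative only (no locking, shared state reduced to a few
variables, hypothetical names), and it assumes at least one partial plan
exists, as the quoted hunk's context does.

/*
 * Toy model of the clamp-at-wrap-around idea from the hunk above.
 * Not PostgreSQL code; identifiers and setup are illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

#define INVALID_SUBPLAN_INDEX	(-1)

static bool pa_finished[] = {true, true, true, true};	/* all plans done */
static int	nplans = 4;
static int	first_partial_plan = 2;	/* assumes at least one partial plan */

/*
 * whichplan is the plan this worker is currently executing (-1 if none).
 * The exit test compares the shared cursor against whichplan, as in the
 * existing loop; clamping whichplan at wrap-around time makes that test
 * reachable even when whichplan is -1 or points at a non-partial plan.
 */
static int
choose_next_subplan_clamped(int *pa_next_plan, int whichplan)
{
	for (;;)
	{
		int		candidate = *pa_next_plan;

		if (!pa_finished[candidate])
			return candidate;

		if (*pa_next_plan < nplans - 1)
			(*pa_next_plan)++;
		else
		{
			/* Loop back to the first partial plan ... */
			*pa_next_plan = first_partial_plan;

			/* ... and arrange for the "tried everything" test to fire. */
			if (whichplan < first_partial_plan)
				whichplan = first_partial_plan;
		}

		if (*pa_next_plan == whichplan)
			return INVALID_SUBPLAN_INDEX;	/* we've tried everything */
	}
}

int
main(void)
{
	int		pa_next_plan = 0;

	/* The hang scenario: worker never chose a plan, everything finished. */
	if (choose_next_subplan_clamped(&pa_next_plan, -1) == INVALID_SUBPLAN_INDEX)
		printf("search terminates instead of spinning\n");
	return 0;
}

Compared with the saved-start-point sketch earlier, this keeps the existing
exit test and adds no new variables, which appears to be the point of the
quoted one-liner.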