On Wed, Apr 5, 2017 at 1:43 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2017-04-04 08:01:32 -0400, Robert Haas wrote:
> > On Tue, Apr 4, 2017 at 12:47 AM, Andres Freund <and...@anarazel.de> wrote:
> > > I don't think the parallel seqscan is comparable in complexity with the
> > > parallel append case. Each worker there does the same kind of work, and
> > > if one of them is behind, it'll just do less. But correct sizing will
> > > be more important with parallel-append, because with non-partial
> > > subplans the work is absolutely *not* uniform.
> >
> > Sure, that's a problem, but I think it's still absolutely necessary to
> > ramp up the maximum "effort" (in terms of number of workers)
> > logarithmically. If you just do it by costing, the winning number of
> > workers will always be the largest number that we think we'll be able
> > to put to use - e.g. with 100 branches of relatively equal cost we'll
> > pick 100 workers. That's not remotely sane.
>
> I'm quite unconvinced that just throwing a log() in there is the best
> way to combat that. Modeling the issue of starting more workers through
> tuple transfer, locking, and startup overhead costing seems a better
> approach to me.
>
> If the goal is to compute the results of the query as fast as possible,
> and to not use more than max_parallel_per_XXX, and it's actually
> beneficial to use more workers, then we should. Because otherwise you
> really can't use the resources available.

+1. I had expressed a similar opinion earlier, but yours is better
articulated. Thanks.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
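[Editor's note: to make the "ramp up logarithmically" idea concrete, here is a
minimal standalone sketch of the kind of heuristic Robert is describing, where
one worker is added per multiplicative increase in the amount of work rather
than one per subplan. The function name, the base of 3, the cap parameter, and
the use of the Append branch count as the unit of work are assumptions made for
illustration only; they are not taken from this thread or from any patch.]

    /*
     * Illustrative only: request one extra worker each time the number of
     * Append subplans triples, capped by the caller-supplied maximum.
     * This mirrors the flavor of heuristic used for sizing parallel seq
     * scans by relation size, applied here to subplan count instead.
     */
    static int
    append_parallel_workers(int nsubplans, int max_workers)
    {
        int workers = 1;

        /* one additional worker per tripling of the subplan count */
        while (nsubplans > 3 && workers < max_workers)
        {
            workers++;
            nsubplans /= 3;
        }

        return workers;
    }

[With 100 roughly equal branches this asks for a handful of workers (four, with
this particular base) rather than 100, which is the behavior Robert argues for.
Andres's counterpoint is that the same restraint should fall out of properly
costing worker startup, tuple transfer, and locking overhead, so that extra
workers are used exactly when they pay for themselves within the configured
maximum.]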