On Wed, Apr 5, 2017 at 1:43 AM, Andres Freund <and...@anarazel.de> wrote:
> On 2017-04-04 08:01:32 -0400, Robert Haas wrote:
> > On Tue, Apr 4, 2017 at 12:47 AM, Andres Freund <and...@anarazel.de> wrote:
> > > I don't think the parallel seqscan is comparable in complexity with the
> > > parallel append case. Each worker there does the same kind of work, and
> > > if one of them is behind, it'll just do less. But correct sizing will
> > > be more important with parallel-append, because with non-partial
> > > subplans the work is absolutely *not* uniform.
> >
> > Sure, that's a problem, but I think it's still absolutely necessary to
> > ramp up the maximum "effort" (in terms of number of workers)
> > logarithmically. If you just do it by costing, the winning number of
> > workers will always be the largest number that we think we'll be able
> > to put to use - e.g. with 100 branches of relatively equal cost we'll
> > pick 100 workers. That's not remotely sane.
>
> I'm quite unconvinced that just throwing a log() in there is the best
> way to combat that. Modeling the issue of starting more workers through
> tuple transfer, locking, and startup overhead costing seems a better
> approach to me.
>
> If the goal is to compute the results of the query as fast as possible,
> and to not use more than max_parallel_per_XXX, and it's actually
> beneficial to use more workers, then we should. Because otherwise you
> really can't use the resources available.

+1. I had expressed a similar opinion earlier, but yours is better
articulated. Thanks.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
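[Editor's note: to make the "ramp up logarithmically" idea concrete, here is a
minimal standalone sketch of the kind of heuristic Robert is describing, where
one worker is added per multiplicative increase in the amount of work rather
than one per subplan. The function name, the base of 3, the cap parameter, and
the use of the Append branch count as the unit of work are assumptions made for
illustration only; they are not taken from this thread or from any patch.]

    /*
     * Illustrative only: request one extra worker each time the number of
     * Append subplans triples, capped by the caller-supplied maximum.
     * This mirrors the flavor of heuristic used for sizing parallel seq
     * scans by relation size, applied here to subplan count instead.
     */
    static int
    append_parallel_workers(int nsubplans, int max_workers)
    {
        int workers = 1;

        /* one additional worker per tripling of the subplan count */
        while (nsubplans > 3 && workers < max_workers)
        {
            workers++;
            nsubplans /= 3;
        }

        return workers;
    }

[With 100 roughly equal branches this asks for a handful of workers (four, with
this particular base) rather than 100, which is the behavior Robert argues for.
Andres's counterpoint is that the same restraint should fall out of properly
costing worker startup, tuple transfer, and locking overhead, so that extra
workers are used exactly when they pay for themselves within the configured
maximum.]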