On 14 November 2014 20:37, Jim Nasby <jim.na...@bluetreble.com> wrote:
> On 11/12/14, 1:54 AM, David Rowley wrote:
>>
>> We'd also need to add some infrastructure to merge aggregate states
>> together for this to work properly. This means that could also work for
>> avg() and stddev etc. For max() and min() the merge functions would
>> likely just be the same as the transition functions.
>>
>
> Sanity check: what % of a large aggregate query fed by a seqscan is
> actually spent in the aggregate functions? Even if you look strictly at
> CPU cost, isn't there more code involved to get data to the aggregate
> function than in the aggregation itself, except maybe for numeric?
>

You might be right, but that sounds like it would need all the parallel
workers to send each matching tuple to a queue to be processed by some
aggregate node. I guess this would have more of an advantage for wider
tables, or tables with many dead tuples, or if the query has quite a
selective WHERE clause, as less data would make it onto that queue.

Perhaps I've taken one step too far forward here. I had been thinking that
each worker would perform the partial seqscan and, in the worker context,
pass each tuple down to the aggregate node. Then later, once each worker
had completed, some other, perhaps new, node type (MergeAggregateStates)
would merge all of those intermediate aggregate states into the final
aggregate state, which would then be ready for the final function to be
called. (There's a rough sketch of what I mean by merging states at the
end of this mail.)

Are there any plans for what will be in charge of deciding how many
workers would be allocated to a parallel query? Will this be something
that's done at planning time? Or should the planner just create a
parallel-friendly plan if the plan is costly enough, and then allow the
executor to decide how many workers to throw at the job, based on how busy
the system is with other tasks at execution time?

Regards

David Rowley
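
Rough sketch of what I mean by merging aggregate states (illustration
only, not actual PostgreSQL code; AvgState, avg_merge and max_merge are
names invented here just to show the idea):

#include <stdio.h>

/*
 * For avg(), each worker would keep a partial (count, sum) transition
 * state.  Merging two states just adds the counts and sums; the final
 * function does the division once, at the end.
 */
typedef struct AvgState
{
	long	count;
	double	sum;
} AvgState;

static AvgState
avg_merge(AvgState a, AvgState b)
{
	AvgState	result;

	result.count = a.count + b.count;
	result.sum = a.sum + b.sum;
	return result;
}

/*
 * For max() the merge function is effectively the transition function
 * again: keep whichever value is larger.
 */
static double
max_merge(double a, double b)
{
	return (a > b) ? a : b;
}

int
main(void)
{
	/* Two workers' intermediate states for avg(x) over their chunks. */
	AvgState	w1 = {3, 30.0};		/* 3 rows, sum 30 */
	AvgState	w2 = {7, 140.0};	/* 7 rows, sum 140 */
	AvgState	merged = avg_merge(w1, w2);

	printf("avg = %f\n", merged.sum / merged.count);	/* 17.0 */
	printf("max = %f\n", max_merge(5.0, 9.0));			/* 9.0 */
	return 0;
}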