On Wed, Mar 16, 2016 at 4:19 PM, David Rowley <david.row...@2ndquadrant.com> wrote: > > On 16 March 2016 at 15:04, Robert Haas <robertmh...@gmail.com> wrote: > > I don't think I'd be objecting if you made PartialAggref a real > > alternative to Aggref. But that's not what you've got here. A > > PartialAggref is just a wrapper around an underlying Aggref that > > changes the interpretation of it - and I think that's not a good idea. > > If you want to have Aggref and PartialAggref as truly parallel node > > types, that seems cool, and possibly better than what you've got here > > now. Alternatively, Aggref can do everything. But I don't think we > > should go with this wrapper concept. > > Ok, I've now gotten rid of the PartialAggref node, and I'm actually > quite happy with how it turned out. I made > search_indexed_tlist_for_partial_aggref() to follow-on the series of > other search_indexed_tlist_for_* functions and have made it behave the > same way, by returning the newly created Var instead of doing that in > fix_combine_agg_expr_mutator(), as the last version did. > > Thanks for the suggestion. > > New patch attached. >
Few assorted comments: 1. /* + * Determine if it's possible to perform aggregation in parallel using + * multiple worker processes. We can permit this when there's at least one + * partial_path in input_rel, but not if the query has grouping sets, + * (although this likely just requires a bit more thought). We must also + * ensure that any aggregate functions which are present in either the + * target list, or in the HAVING clause all support parallel mode. + */ + can_parallel = false; + + if ((parse->hasAggs || parse->groupClause != NIL) && + input_rel->partial_pathlist != NIL && + parse->groupingSets == NIL && + root->glob->parallelModeOK) I think here you need to use has_parallel_hazard() with second parameter as false to ensure expressions are parallel safe. glob->parallelModeOK flag indicates that there is no parallel unsafe expression, but it can still contain parallel restricted expression. 2. AggPath * create_agg_path(PlannerInfo *root, @@ -2397,9 +2399,11 @@ create_agg_path(PlannerInfo *root, List *groupClause, List *qual, const AggClauseCosts *aggcosts, - double numGroups) + double numGroups, + bool combineStates, + bool finalizeAggs) Don't you need to set parallel_aware flag in this function as we do for create_seqscan_path()? 3. postgres=# explain select count(*) from t1; QUERY PLAN -------------------------------------------------------------------------------- ------ Finalize Aggregate (cost=45420.57..45420.58 rows=1 width=8) -> Gather (cost=45420.35..45420.56 rows=2 width=8) Number of Workers: 2 -> Partial Aggregate (cost=44420.35..44420.36 rows=1 width=8) -> Parallel Seq Scan on t1 (cost=0.00..44107.88 rows=124988 wid th=0) (5 rows) Isn't it better to call it as Parallel Aggregate instead of Partial Aggregate. Initialy, we have kept Partial for seqscan, but later on we changed to Parallel Seq Scan, so I am not able to think why it is better to call Partial incase of Aggregates. 4. /* + * Likewise for any partial paths, although this case is more simple as + * we don't track the cheapest path. + */ + foreach(lc, current_rel->partial_pathlist) + { + Path *subpath = (Path *) lfirst(lc); + + Assert(subpath->param_info == NULL); + lfirst(lc) = apply_projection_to_path(root, current_rel, + subpath, scanjoin_target); + } + Can't we do this by teaching apply_projection_to_path() as done in the latest patch posted by me to push down the target list beneath workers [1]. 5. + /* + * If we find any aggs with an internal transtype then we must ensure + * that pointers to aggregate states are not passed to other processes, + * therefore we set the maximum degree to PAT_INTERNAL_ONLY. + */ + if (aggform->aggtranstype == INTERNALOID) + context->allowedtype = PAT_INTERNAL_ONLY; In the above comment, you have refered maximum degree which is not making much sense to me. If it is not a typo, then can you clarify the same. 6. + * fix_combine_agg_expr + * Like fix_upper_expr() but additionally adjusts the Aggref->args of + * Aggrefs so that they references the corresponding Aggref in the subplan. + */ +static Node * +fix_combine_agg_expr(PlannerInfo *root, + Node *node, + indexed_tlist *subplan_itlist, + Index newvarno, + int rtoffset) +{ + fix_upper_expr_context context; + + context.root = root; + context.subplan_itlist = subplan_itlist; + context.newvarno = newvarno; + context.rtoffset = rtoffset; + return fix_combine_agg_expr_mutator(node, &context); +} + +static Node * +fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context) Don't we want to handle the case of context->subplan_itlist->has_non_vars as it is handled in fix_upper_expr_mutator()? If not then, I think adding the reason for same in comments above function would be better. 7. tlist.c +} \ No newline at end of file There should be a new line at end of file. [1] - http://www.postgresql.org/message-id/CAA4eK1Jk8hm-2j-CKjvdd0CZTsdPX=edk_qhzc4689hq0xt...@mail.gmail.com With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com