On Thu, Feb 1, 2018 at 1:11 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Jan 29, 2018 at 3:42 AM, Jeevan Chalke > <jeevan.cha...@enterprisedb.com> wrote: > > Attached new patch set and rebased it on latest HEAD. > > I strongly dislike add_single_path_to_append_rel. It adds branches > and complexity to code that is already very complex. Most > importantly, why are we adding paths to fields in > OtherUpperPathExtraData *extra instead of adding them to the path list > of some RelOptInfo? If we had an appropriate RelOptInfo to which we > could add the paths, then we could make this simpler. > > If I understand correctly, the reason you're doing it this way is > because we have no place to put partially-aggregated, non-partial > paths. If we only needed to worry about the parallel query case, we > could just build an append of partially-aggregated paths for each > child and stick it into the grouped rel's partial pathlist, just as we > already do for regular parallel aggregation. There's no reason why > add_paths_to_grouping_rel() needs to care about the difference a > Partial Aggregate on top of whatever and an Append each branch of > which is a Partial Aggregate on top of whatever. However, this won't > work for non-partial paths, because add_paths_to_grouping_rel() needs > to put paths into the grouped rel's pathlist -- and we can't mix > together partially-aggregated paths and fully-aggregated paths in the > same path list. > Yes. > > But, really, the way we're using grouped_rel->partial_pathlist right > now is an awful hack. What I'm thinking we could do is introduce a > new UpperRelationKind called UPPERREL_PARTIAL_GROUP_AGG, coming just > before UPPERREL_GROUP_AGG. Without partition-wise aggregate, > partially_grouped_rel's pathlist would always be NIL, and its partial > pathlist would be constructed using the logic in > add_partial_paths_to_grouping_rel, which would need renaming. Then, > add_paths_to_grouping_rel would use paths from input_rel when doing > non-parallel aggregation and paths from partially_grouped_rel when > doing parallel aggregation. This would eliminate the ugly > grouped_rel->partial_pathlist = NIL assignment at the bottom of > create_grouping_paths(), because the grouped_rel's partial_pathlist > would never have been (bogusly) populated in the first place, and > hence would not need to be reset. All of these changes could be made > via a preparatory patch. > I wrote a patch for this (on current HEAD) and attached separately here. Please have a look. I still not yet fully understand how we are going to pass those to the add_paths_to_append_rel(). I need to look it more deeply though. > Then the main patch needs to worry about four cases: > > 1. Parallel partition-wise aggregate, grouping key doesn't contain > partition key. This should just be a matter of adding additional > Append paths to partially_grouped_rel->partial_pathlist. The existing > code already knows how to stick a Gather and FinalizeAggregate step on > top of that, and I don't see why that logic would need any > modification or addition. An Append of child partial-grouping paths > should be producing the same output as a partial grouping of an > Append, except that the former case might produce more separate groups > that need to be merged; but that should be OK: we can just throw all > the paths into the same path list and let the cheapest one win. > For any partial aggregation we need to add finalization step after we are done with the APPEND i.e. post add_paths_to_append_rel(). Given that we need to replicate the logic of sticking Gather and FinalizeAggregate step at later stage. This is what exactly done in create_partition_agg_paths(). Am I missing something here? > 2. Parallel partition-wise aggregate, grouping key contains partition > key. For the most part, this is no different from case #1. We won't > have groups spanning different partitions in this case, but we might > have groups spanning different workers, so we still need a > FinalizeAggregate step. As an exception, Gather -> Parallel Append -> > [non-partial Aggregate path] would give us a way of doing aggregation > in parallel without a separate Finalize step. I'm not sure if we want > to consider that to be in scope for this patch. If we do, then we'd > add the Parallel Append path to grouped_rel->partial_pathlist. Then, > we could stick Gather (Merge) on top if it to produce a path for > grouped_rel->pathlist using generate_gather_paths(); alternatively, it > can be used by upper planning steps -- something we currently can't > ever make work with parallel aggregation. > > 3. Non-parallel partition-wise aggregate, grouping key contains > partition key. Build Append paths from the children of grouped_rel > and add them to grouped_rel->pathlist. > Yes. > > 3. Non-parallel partition-wise aggregate, grouping key doesn't contain > partition key. Build Append paths from the children of > partially_grouped_rel and add them to partially_grouped_rel->pathlist. > Also add code to generate paths for grouped_rel->pathlist by sticking > a FinalizeAggregate step on top of each path from > partially_grouped_rel->pathlist. > Yes, this is done in create_partition_agg_paths(). create_partition_agg_paths() basically adds gather path, if required and then finalizes it again if required. These steps are similar to that of add_paths_to_grouping_rel() counterpart which does gather + finalization. > Overall, what I'm trying to argue for here is making this feature look > less like its own separate thing and more like part of the general > mechanism we've already got: partial paths would turn into regular > paths via generate_gather_paths(), and partially aggregated paths > would turn into fully aggregated paths by adding FinalizeAggregate. > The existing special case that allows us to build a non-partial, fully > aggregated path from a partial, partially-aggregated path would be > preserved. > > I think this would probably eliminate some other problems I see in the > existing design as well. For example, create_partition_agg_paths > doesn't consider using Gather Merge, but that might be a win. Append path is always non-sorted and has no pathkeys. Thus Gather Merge over an Append path seems infeasible, isn't it? > With > the design above, I think you never need to call create_gather_path() > anywhere. In case #1, the existing code takes care of it. In the > special case mentioned under #2, if we chose to support that, > generate_gather_paths() would take care of it. Both of those places > already know about Gather Merge. > I don't understand how exactly, will have more careful look over this. > On another note, I found preferFullAgg to be wicked confusing. To > "prefer" something is to like it better, but be willing to accept > other options if the preference can't be accommodated. Here, it seems > like preferFullAgg = false prevents consideration of full aggregation. > So it's really more like allowFullAgg, or, maybe better, > try_full_aggregation. Also, try_partition_wise_grouping has a > variable isPartialAgg which is always ends up getting set to > !preferFullAgg. Having two Boolean variables which are always set to > the opposite of each other isn't good. To add to the confusion, the > code following the place where isPartialAgg is set sometimes refers to > isPartialAgg and sometimes refers to preferFullAgg. > I will have a look over this and commenting part. > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > Thanks -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 2a4e22b..8b8a567 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -186,15 +186,18 @@ static PathTarget *make_sort_input_target(PlannerInfo *root, static void adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel, List *targets, List *targets_contain_srfs); static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, - RelOptInfo *grouped_rel, PathTarget *target, + RelOptInfo *grouped_rel, + RelOptInfo *partially_grouped_rel, + PathTarget *target, PathTarget *partial_grouping_target, const AggClauseCosts *agg_costs, const AggClauseCosts *agg_final_costs, grouping_sets_data *gd, bool can_sort, bool can_hash, double dNumGroups, List *havingQual); -static void add_partial_paths_to_grouping_rel(PlannerInfo *root, +static void add_paths_to_partial_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, RelOptInfo *grouped_rel, + RelOptInfo *partial_grouped_rel, PathTarget *target, PathTarget *partial_grouping_target, AggClauseCosts *agg_partial_costs, @@ -3627,6 +3630,7 @@ create_grouping_paths(PlannerInfo *root, Query *parse = root->parse; Path *cheapest_path = input_rel->cheapest_total_path; RelOptInfo *grouped_rel; + RelOptInfo *partially_grouped_rel; PathTarget *partial_grouping_target = NULL; AggClauseCosts agg_partial_costs; /* parallel only */ AggClauseCosts agg_final_costs; /* parallel only */ @@ -3637,6 +3641,8 @@ create_grouping_paths(PlannerInfo *root, /* For now, do all work in the (GROUP_AGG, NULL) upperrel */ grouped_rel = fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL); + partially_grouped_rel = fetch_upper_rel(root, UPPERREL_PARTIAL_GROUP_AGG, + NULL); /* * If the input relation is not parallel-safe, then the grouped relation @@ -3817,7 +3823,8 @@ create_grouping_paths(PlannerInfo *root, &agg_final_costs); } - add_partial_paths_to_grouping_rel(root, input_rel, grouped_rel, target, + add_paths_to_partial_grouping_rel(root, input_rel, grouped_rel, + partially_grouped_rel, target, partial_grouping_target, &agg_partial_costs, &agg_final_costs, gd, can_sort, can_hash, @@ -3825,7 +3832,8 @@ create_grouping_paths(PlannerInfo *root, } /* Build final grouping paths */ - add_paths_to_grouping_rel(root, input_rel, grouped_rel, target, + add_paths_to_grouping_rel(root, input_rel, grouped_rel, + partially_grouped_rel, target, partial_grouping_target, agg_costs, &agg_final_costs, gd, can_sort, can_hash, dNumGroups, (List *) parse->havingQual); @@ -5859,7 +5867,9 @@ get_partitioned_child_rels_for_join(PlannerInfo *root, Relids join_relids) */ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, - RelOptInfo *grouped_rel, PathTarget *target, + RelOptInfo *grouped_rel, + RelOptInfo *partially_grouped_rel, + PathTarget *target, PathTarget *partial_grouping_target, const AggClauseCosts *agg_costs, const AggClauseCosts *agg_final_costs, @@ -5870,6 +5880,12 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, Path *cheapest_path = input_rel->cheapest_total_path; ListCell *lc; + /* + * Parallel aggregation's partial paths must be stored in a + * partially_grouped_rel and not in a grouped_rel. + */ + Assert(grouped_rel->partial_pathlist == NIL); + if (can_sort) { /* @@ -5945,10 +5961,13 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, * Now generate a complete GroupAgg Path atop of the cheapest partial * path. We can do this using either Gather or Gather Merge. */ - if (grouped_rel->partial_pathlist) + if (partially_grouped_rel->partial_pathlist) { - Path *path = (Path *) linitial(grouped_rel->partial_pathlist); - double total_groups = path->rows * path->parallel_workers; + Path *path; + double total_groups; + + path = (Path *) linitial(partially_grouped_rel->partial_pathlist); + total_groups = path->rows * path->parallel_workers; path = (Path *) create_gather_path(root, grouped_rel, @@ -5999,7 +6018,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, */ if (parse->groupClause != NIL && root->group_pathkeys != NIL) { - foreach(lc, grouped_rel->partial_pathlist) + foreach(lc, partially_grouped_rel->partial_pathlist) { Path *subpath = (Path *) lfirst(lc); Path *gmpath; @@ -6107,9 +6126,11 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, * again, we'll only do this if it looks as though the hash table * won't exceed work_mem. */ - if (grouped_rel->partial_pathlist) + if (partially_grouped_rel->partial_pathlist) { - Path *path = (Path *) linitial(grouped_rel->partial_pathlist); + Path *path; + + path = (Path *) linitial(partially_grouped_rel->partial_pathlist); hashaggtablesize = estimate_hashagg_tablesize(path, agg_final_costs, @@ -6143,15 +6164,16 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, } /* - * add_partial_paths_to_grouping_rel + * add_paths_to_partial_grouping_rel * * Add partial paths to grouping relation. These paths are not fully * aggregated; a FinalizeAggregate step is still required. */ static void -add_partial_paths_to_grouping_rel(PlannerInfo *root, +add_paths_to_partial_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel, RelOptInfo *grouped_rel, + RelOptInfo *partially_grouped_rel, PathTarget *target, PathTarget *partial_grouping_target, AggClauseCosts *agg_partial_costs, @@ -6199,7 +6221,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root, -1.0); if (parse->hasAggs) - add_partial_path(grouped_rel, (Path *) + add_partial_path(partially_grouped_rel, (Path *) create_agg_path(root, grouped_rel, path, @@ -6211,7 +6233,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root, agg_partial_costs, dNumPartialGroups)); else - add_partial_path(grouped_rel, (Path *) + add_partial_path(partially_grouped_rel, (Path *) create_group_path(root, grouped_rel, path, @@ -6239,7 +6261,7 @@ add_partial_paths_to_grouping_rel(PlannerInfo *root, */ if (hashaggtablesize < work_mem * 1024L) { - add_partial_path(grouped_rel, (Path *) + add_partial_path(partially_grouped_rel, (Path *) create_agg_path(root, grouped_rel, cheapest_partial_path, diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 6bf68f3..a10b150 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -71,6 +71,8 @@ typedef struct AggClauseCosts typedef enum UpperRelationKind { UPPERREL_SETOP, /* result of UNION/INTERSECT/EXCEPT, if any */ + UPPERREL_PARTIAL_GROUP_AGG, /* result of partial grouping/aggregation, if + * any */ UPPERREL_GROUP_AGG, /* result of grouping/aggregation, if any */ UPPERREL_WINDOW, /* result of window functions, if any */ UPPERREL_DISTINCT, /* result of "SELECT DISTINCT", if any */