Hi Robert,

On Fri, Feb 23, 2018 at 2:53 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Thu, Feb 8, 2018 at 8:05 AM, Jeevan Chalke
> <jeevan.cha...@enterprisedb.com> wrote:
> > In this attached version, I have rebased my changes over the new design of
> > partially_grouped_rel. The preparatory changes of adding
> > partially_grouped_rel are in 0001.
>
> I spent today hacking in 0001; results attached.  The big change from
> your version is that this now uses generate_gather_paths() to add
> Gather/Gather Merge nodes (except in the case where we sort by group
> pathkeys and then Gather Merge) rather than keeping all of the bespoke
> code.  That turned out to be a bit less elegant than I would have liked
> -- I had to add an override_rows argument to generate_gather_paths to
> make it work.  But overall I think this is still a big improvement,
> since it lets us share code instead of duplicating it.  Also, it
> potentially lets us add partially-aggregated but non-parallel paths
> into partially_grouped_rel->pathlist and that should Just Work; they
> will get the Finalize Aggregate step but not the Gather.  With your
> arrangement that wouldn't work.
>
> Please review/test.

I have reviewed and tested the patch; here are a couple of points:

     /*
-     * If the input rel belongs to a single FDW, so does the grouped rel.
+     * If the input rel belongs to a single FDW, so does the grouped rel. Same
+     * for the partially_grouped_rel.
      */
     grouped_rel->serverid = input_rel->serverid;
     grouped_rel->userid = input_rel->userid;
     grouped_rel->useridiscurrent = input_rel->useridiscurrent;
     grouped_rel->fdwroutine = input_rel->fdwroutine;
+    partially_grouped_rel->serverid = input_rel->serverid;
+    partially_grouped_rel->userid = input_rel->userid;
+    partially_grouped_rel->useridiscurrent = input_rel->useridiscurrent;
+    partially_grouped_rel->fdwroutine = input_rel->fdwroutine;

In my earlier mail, where I posted a patch for these partially_grouped_rel
changes, I forgot to raise my question about this.
I was unclear about the above changes, and thus passed grouped_rel whenever
we wanted to fetch these details while working on partially_grouped_rel.
One idea I thought about is to memcpy() the struct once we have set all the
required fields for grouped_rel, so that we don't have to do similar stuff
for partially_grouped_rel.

---

+         * Insert a Sort node, if required.  But there's no point in
+         * sorting anything but the cheapest path.
          */
-        if (root->group_pathkeys)
+        if (!pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
+        {
+            if (path != linitial(partially_grouped_rel->pathlist))
+                continue;

Paths in pathlist are added by add_path().  Although the pathlist is kept
sorted with the cheapest total path first, we generally use
RelOptInfo->cheapest_total_path instead of the first entry, unlike for
partial paths.  But here you use the first entry, as in the partial-paths
case.  Would it be better to use the cheapest total path from
partially_grouped_rel?  That would require calling set_cheapest() on
partially_grouped_rel before we call this function.  The attached top-up
patch does this, along with a few indentation fixes.

The rest of the changes look good to me.  Once this gets in, I will rebase
my other patches accordingly.

And thanks for committing 0006.

Thanks

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 1c792a0..f6b0208 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -2453,7 +2453,8 @@ set_worktable_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
  * we must do something.)
  */
 void
-generate_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_rows)
+generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
+                      bool override_rows)
 {
     Path       *cheapest_partial_path;
     Path       *simple_gather_path;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index e4f9bd4..e8f6cc5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -6101,7 +6101,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
          */
         if (!pathkeys_contained_in(root->group_pathkeys, path->pathkeys))
         {
-            if (path != linitial(partially_grouped_rel->pathlist))
+            if (path != partially_grouped_rel->cheapest_total_path)
                 continue;
             path = (Path *) create_sort_path(root,
                                              grouped_rel,
@@ -6186,9 +6186,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
          */
         if (partially_grouped_rel->pathlist)
         {
-            Path       *path;
-
-            path = (Path *) linitial(partially_grouped_rel->pathlist);
+            Path       *path = partially_grouped_rel->cheapest_total_path;

             hashaggtablesize = estimate_hashagg_tablesize(path,
                                                           agg_final_costs,
@@ -6369,6 +6367,9 @@ add_paths_to_partial_grouping_rel(PlannerInfo *root,
         add_path(partially_grouped_rel, path);
     }
+
+    /* Now choose the best path(s) */
+    set_cheapest(partially_grouped_rel);
 }

 /*
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 2011f66..94f9bb2 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -54,7 +54,7 @@ extern RelOptInfo *standard_join_search(PlannerInfo *root, int levels_needed,
                        List *initial_rels);
 extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
-                bool override_rows);
+                                  bool override_rows);
 extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
                 double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,