On Thu, Nov 12, 2015 at 12:09 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> I'm now designing the parallel feature of Append...
>
> Here is one challenge. How do we determine whether each sub-plan
> allows execution in the background worker context?
I've been thinking about these questions for a bit now, and I think we can work on improving Append in multiple phases.  The attached patch shows what I have in mind for phase 1.  Currently, if you set up an inheritance hierarchy, you might get an Append some of whose children are Gather nodes with Parallel Seq Scans under them, like this:

Append
-> Gather
  -> Parallel Seq Scan
-> Gather
  -> Parallel Seq Scan
-> Seq Scan

This is a crappy plan.  Each Gather node will try to launch its own bunch of workers, which sucks.  The attached patch lets you get this plan instead:

Gather
-> Append
  -> Partial Seq Scan
  -> Partial Seq Scan
  -> Partial Seq Scan

That's a much better plan.

To make this work, the patch introduces a couple of new concepts.  Each RelOptInfo gets a new partial_pathlist field, which stores paths that need to be gathered to produce a workable plan.  Once we've populated the partial_pathlist with whatever partial paths we know how to generate, we can either push a Gather node on top of one of those partial paths to create a real path, or we can use those partial paths to build more partial paths.  The current patch handles only the simplest case: we can build a partial path for an appendrel by appending a partial path for each member rel.  But eventually I hope to handle joinrels this way as well: you can join a partial path with an ordinary path to form a partial path for the joinrel.

This requires some way of figuring out how many workers to request for the append-path, so the patch also adds a parallel_degree field to the path object, which is 0 for non-parallel things and otherwise the number of workers that the path believes to be ideal.  Right now, it just percolates the highest worker count of any child up to the appendrel, which might not be right, especially once the append node knows how to balance workers, but it seems like a reasonable place to start.

> Type-A) It can be performed on background worker context and
>         picked up by multiple worker processes concurrently.
>         (e.g: Parallel SeqScan)
> Type-B) It can be performed on background worker context but
>         cannot be picked up by multiple worker processes.
>         (e.g: non-parallel aware node)
> Type-C) It should not be performed on background worker context.
>         (e.g: plan/path node with volatile functions)

At the time that we're forming an AppendPath, we can identify what you're calling type-A paths very easily: they're the ones that are coming from the partial_pathlist.  We can distinguish between type-B and type-C paths coming from the childrel's pathlist based on the childrel's consider_parallel flag: if it's false, the path is type-C; otherwise, it's type-B.  At some point, we might need a per-path flag to distinguish cases where a particular path is type-C even though some other plan for that relation might be type-B, but right now that case doesn't arise.

Now, of course, it's not good enough to have this information available when we're generating the AppendPath; it has to survive until execution time.  But that doesn't mean the paths need to be self-identifying.  We could, of course, decide to make them so, and maybe that's the best design, but I'm thinking about another approach: suppose the append node itself, instead of having one list of child plans, has a list of type-A plans, a list of type-B plans, and a list of type-C plans.
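To make that concrete, such a node might look roughly like this (purely a sketch - the attached patch doesn't touch the Append node at all, and the struct name and field names here are placeholders, not anything the patch defines):

#include "postgres.h"
#include "nodes/plannodes.h"	/* for Plan and List */

/*
 * Sketch only - not part of the attached patch.  A separate struct name is
 * used just to keep the sketch self-contained; the real change would
 * replace Append's single list of child plans with these three lists.
 */
typedef struct ThreeListAppend
{
	Plan		plan;				/* common plan-node header */
	List	   *restricted_plans;	/* type-C: may only run in the leader */
	List	   *safe_plans;			/* type-B: may run in any one process */
	List	   *parallel_plans;		/* type-A: shareable among all workers */
} ThreeListAppend;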
This actually seems quite convenient, because at execution time you presumably want the leader to prioritize type-C plans over any of the others, since it's the only process that can execute them, and the workers to prioritize type-B plans, since those can absorb only one worker each and are thus prone to be the limiting factor on finishing the whole Append.  Having the plans divided in advance between these three lists (say, restricted_plans, safe_plans, parallel_plans) makes that easy to implement.

Incidentally, I think it's subtly wrong to think of the parallel_aware flag as telling you whether the plan can absorb multiple workers.  That's not really what it's for.  It's to tell you whether the plan is doing *something* parallel-aware - that is, whether its Estimate, InitializeDSM, and InitializeWorker callbacks should do anything.  For SeqScan, flipping parallel_aware actually does split the input among all the workers, but for Append it probably just load-balances, and for other nodes it might be something else again.  The term I'm using to indicate a path/plan that returns only a subset of the results in each worker is "partial".  Whether or not a path is partial is, in the design embodied in this patch, indicated both by whether path->parallel_degree > 0 and by whether the path is stored in rel->pathlist or in rel->partial_pathlist.
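For concreteness, that dispatch priority could look something like this (again only a sketch, not part of the attached patch; the three lists are the hypothetical ones named above, and exactly what the leader should do after draining the restricted plans is an open question):

#include "postgres.h"
#include "nodes/pg_list.h"

/*
 * Sketch only - not part of the attached patch.  Build the order in which
 * one process should try the child plans: the leader starts with the
 * leader-only (type-C) plans, since nobody else can run them; workers
 * start with the single-process (type-B) plans, which are the likely
 * stragglers; everyone falls back to the fully parallel (type-A) plans.
 */
static List *
child_plan_priority(bool is_leader, List *restricted_plans,
					List *safe_plans, List *parallel_plans)
{
	List	   *order = NIL;

	if (is_leader)
		order = list_concat(order, list_copy(restricted_plans));
	order = list_concat(order, list_copy(safe_plans));
	order = list_concat(order, list_copy(parallel_plans));
	return order;
}

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company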
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 012c14b..fe07176 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1588,6 +1588,7 @@ _outPathInfo(StringInfo str, const Path *node)
 	else
 		_outBitmapset(str, NULL);
 	WRITE_BOOL_FIELD(parallel_aware);
+	WRITE_INT_FIELD(parallel_degree);
 	WRITE_FLOAT_FIELD(rows, "%.0f");
 	WRITE_FLOAT_FIELD(startup_cost, "%.2f");
 	WRITE_FLOAT_FIELD(total_cost, "%.2f");
@@ -1764,7 +1765,6 @@ _outGatherPath(StringInfo str, const GatherPath *node)
 	_outPathInfo(str, (const Path *) node);
 
 	WRITE_NODE_FIELD(subpath);
-	WRITE_INT_FIELD(num_workers);
 	WRITE_BOOL_FIELD(single_copy);
 }
 
@@ -1887,6 +1887,7 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
 	WRITE_NODE_FIELD(reltargetlist);
 	WRITE_NODE_FIELD(pathlist);
 	WRITE_NODE_FIELD(ppilist);
+	WRITE_NODE_FIELD(partial_pathlist);
 	WRITE_NODE_FIELD(cheapest_startup_path);
 	WRITE_NODE_FIELD(cheapest_total_path);
 	WRITE_NODE_FIELD(cheapest_unique_path);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 1fdcae5..fdbe13f 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -72,6 +72,7 @@ static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 Index rti, RangeTblEntry *rte);
 static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
 				   RangeTblEntry *rte);
+static void create_parallel_paths(PlannerInfo *root, RelOptInfo *rel);
 static void set_rel_consider_parallel(PlannerInfo *root, RelOptInfo *rel,
 						  RangeTblEntry *rte);
 static bool function_rte_parallel_ok(RangeTblEntry *rte);
@@ -107,6 +108,7 @@ static void set_cte_pathlist(PlannerInfo *root, RelOptInfo *rel,
 				 RangeTblEntry *rte);
 static void set_worktable_pathlist(PlannerInfo *root, RelOptInfo *rel,
 					   RangeTblEntry *rte);
+static void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel);
 static RelOptInfo *make_rel_from_joinlist(PlannerInfo *root, List *joinlist);
 static bool subquery_is_pushdown_safe(Query *subquery, Query *topquery,
 						  pushdown_safety_info *safetyInfo);
@@ -612,7 +614,6 @@ static void
 set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 {
 	Relids		required_outer;
-	int			parallel_threshold = 1000;
 
 	/*
 	 * We don't support pushing join clauses into the quals of a seqscan, but
@@ -624,39 +625,9 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 	/* Consider sequential scan */
 	add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
 
-	/* Consider parallel sequential scan */
-	if (rel->consider_parallel && rel->pages > parallel_threshold &&
-		required_outer == NULL)
-	{
-		Path	   *path;
-		int			parallel_degree = 1;
-
-		/*
-		 * Limit the degree of parallelism logarithmically based on the size
-		 * of the relation.  This probably needs to be a good deal more
-		 * sophisticated, but we need something here for now.
-		 */
-		while (rel->pages > parallel_threshold * 3 &&
-			   parallel_degree < max_parallel_degree)
-		{
-			parallel_degree++;
-			parallel_threshold *= 3;
-			if (parallel_threshold >= PG_INT32_MAX / 3)
-				break;
-		}
-
-		/*
-		 * Ideally we should consider postponing the gather operation until
-		 * much later, after we've pushed joins and so on atop the parallel
-		 * sequential scan path.  But we don't have the infrastructure for
-		 * that yet, so just do this for now.
-		 */
-		path = create_seqscan_path(root, rel, required_outer, parallel_degree);
-		path = (Path *)
-			create_gather_path(root, rel, path, required_outer,
-							   parallel_degree);
-		add_path(rel, path);
-	}
+	/* If appropriate, consider parallel sequential scan */
+	if (rel->consider_parallel && required_outer == NULL)
+		create_parallel_paths(root, rel);
 
 	/* Consider index scans */
 	create_index_paths(root, rel);
@@ -666,6 +637,54 @@ set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 }
 
 /*
+ * create_parallel_paths
+ *	  Build parallel access paths for a plain relation
+ */
+static void
+create_parallel_paths(PlannerInfo *root, RelOptInfo *rel)
+{
+	int			parallel_threshold = 1000;
+	int			parallel_degree = 1;
+
+	/*
+	 * If this relation is too small to be worth a parallel scan, just return
+	 * without doing anything ... unless it's an inheritance child.  In that
+	 * case, we want to generate a parallel path here anyway.  It might not
+	 * be worthwhile just for this relation, but when combined with all of
+	 * its inheritance siblings it may well pay off.
+	 */
+	if (rel->pages < parallel_threshold && rel->reloptkind == RELOPT_BASEREL)
+		return;
+
+	/*
+	 * Limit the degree of parallelism logarithmically based on the size of
+	 * the relation.  This probably needs to be a good deal more
+	 * sophisticated, but we need something here for now.
+	 */
+	while (rel->pages > parallel_threshold * 3 &&
+		   parallel_degree < max_parallel_degree)
+	{
+		parallel_degree++;
+		parallel_threshold *= 3;
+		if (parallel_threshold >= PG_INT32_MAX / 3)
+			break;
+	}
+
+	/* Add an unordered partial path based on a parallel sequential scan. */
+	add_partial_path(rel, create_seqscan_path(root, rel, NULL, parallel_degree));
+
+	/*
+	 * If this is a baserel, consider gathering any partial paths we may
+	 * have just created.  If we gathered an inheritance child, we could
+	 * end up with a very large number of gather nodes, each trying to grab
+	 * its own pool of workers, so don't do this in that case.  Instead,
+	 * we'll consider gathering partial paths for the appendrel.
+	 */
+	if (rel->reloptkind == RELOPT_BASEREL)
+		generate_gather_paths(root, rel);
+}
+
+/*
  * set_tablesample_rel_size
  *	  Set size estimates for a sampled relation
  */
@@ -1039,6 +1058,8 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	List	   *live_childrels = NIL;
 	List	   *subpaths = NIL;
 	bool		subpaths_valid = true;
+	List	   *partial_subpaths = NIL;
+	bool		partial_subpaths_valid = true;
 	List	   *all_child_pathkeys = NIL;
 	List	   *all_child_outers = NIL;
 	ListCell   *l;
@@ -1093,6 +1114,13 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 		else
 			subpaths_valid = false;
 
+		/* Same idea, but for a partial plan. */
+		if (childrel->partial_pathlist != NIL)
+			partial_subpaths = accumulate_append_subpath(partial_subpaths,
+									  linitial(childrel->partial_pathlist));
+		else
+			partial_subpaths_valid = false;
+
 		/*
 		 * Collect lists of all the available path orderings and
 		 * parameterizations for all the children.  We use these as a
@@ -1164,7 +1192,38 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	 * if we have zero or one live subpath due to constraint exclusion.)
 	 */
 	if (subpaths_valid)
-		add_path(rel, (Path *) create_append_path(rel, subpaths, NULL));
+		add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0));
+
+	/*
+	 * Consider an append of unordered, unparameterized partial paths.
+	 */
+	if (partial_subpaths_valid)
+	{
+		AppendPath *appendpath;
+		ListCell   *lc;
+		int			parallel_degree = 0;
+
+		/*
+		 * Decide what parallel degree to request for this append path.  For
+		 * now, we just use the maximum parallel degree of any member.  It
+		 * might be useful to use a higher number if the Append node were
+		 * smart enough to spread out the workers, but it currently isn't.
+		 */
+		foreach(lc, partial_subpaths)
+		{
+			Path	   *path = lfirst(lc);
+
+			parallel_degree = Max(parallel_degree, path->parallel_degree);
+		}
+		Assert(parallel_degree > 0);
+
+		/* Generate a partial append path. */
+		appendpath = create_append_path(rel, partial_subpaths, NULL,
+										parallel_degree);
+		add_partial_path(rel, (Path *) appendpath);
+
+		/* Consider gathering it. */
+		generate_gather_paths(root, rel);
+	}
 
 	/*
 	 * Also build unparameterized MergeAppend paths based on the collected
@@ -1214,7 +1273,7 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 
 		if (subpaths_valid)
 			add_path(rel, (Path *)
-					 create_append_path(rel, subpaths, required_outer));
+					 create_append_path(rel, subpaths, required_outer, 0));
 	}
 }
 
@@ -1440,8 +1499,9 @@ set_dummy_rel_pathlist(RelOptInfo *rel)
 
 	/* Discard any pre-existing paths; no further need for them */
 	rel->pathlist = NIL;
+	rel->partial_pathlist = NIL;
 
-	add_path(rel, (Path *) create_append_path(rel, NIL, NULL));
+	add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0));
 
 	/*
 	 * We set the cheapest path immediately, to ensure that IS_DUMMY_REL()
@@ -1844,6 +1904,35 @@ set_worktable_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
 }
 
 /*
+ * generate_gather_paths
+ *	  Generate parallel access paths for a relation by pushing a Gather on
+ *	  top of a partial path.
+ */
+static void
+generate_gather_paths(PlannerInfo *root, RelOptInfo *rel)
+{
+	Path	   *cheapest_partial_path;
+	Path	   *simple_gather_path;
+
+	/* If there are no partial paths, there's nothing to do here. */
+	if (rel->partial_pathlist == NIL)
+		return;
+
+	/*
+	 * The output of Gather is currently always unsorted, so there's only one
+	 * partial path of interest: the cheapest one.
+	 *
+	 * Eventually, we should have a Gather Merge operation that can merge
+	 * multiple tuple streams together while preserving their ordering.  We
+	 * could usefully generate such a path from each partial path that has
+	 * non-NIL pathkeys.
+	 */
+	cheapest_partial_path = linitial(rel->partial_pathlist);
+	simple_gather_path = (Path *)
+		create_gather_path(root, rel, cheapest_partial_path, NULL);
+	add_path(rel, simple_gather_path);
+}
+
+/*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
  *
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 990486c..9d65be9 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -186,8 +186,7 @@ clamp_row_est(double nrows)
  */
 void
 cost_seqscan(Path *path, PlannerInfo *root,
-			 RelOptInfo *baserel, ParamPathInfo *param_info,
-			 int nworkers)
+			 RelOptInfo *baserel, ParamPathInfo *param_info)
 {
 	Cost		startup_cost = 0;
 	Cost		run_cost = 0;
@@ -232,8 +231,8 @@ cost_seqscan(Path *path, PlannerInfo *root,
 	 * This is almost certainly not exactly the right way to model this, so
 	 * this will probably need to be changed at some point...
 	 */
-	if (nworkers > 0)
-		run_cost = run_cost / (nworkers + 0.5);
+	if (path->parallel_degree > 0)
+		run_cost = run_cost / (path->parallel_degree + 0.5);
 
 	path->startup_cost = startup_cost;
 	path->total_cost = startup_cost + run_cost;
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index b2cc9f0..9b2b0b4 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1069,9 +1069,10 @@ mark_dummy_rel(RelOptInfo *rel)
 
 	/* Evict any previously chosen paths */
 	rel->pathlist = NIL;
+	rel->partial_pathlist = NIL;
 
 	/* Set up the dummy path */
-	add_path(rel, (Path *) create_append_path(rel, NIL, NULL));
+	add_path(rel, (Path *) create_append_path(rel, NIL, NULL, 0));
 
 	/* Set or update cheapest_total_path and related fields */
 	set_cheapest(rel);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 411b36c..95d95f1 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1125,7 +1125,7 @@ create_gather_plan(PlannerInfo *root, GatherPath *best_path)
 
 	gather_plan = make_gather(subplan->targetlist,
 							  NIL,
-							  best_path->num_workers,
+							  best_path->path.parallel_degree,
 							  best_path->single_copy,
 							  subplan);
 
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 09c3244..166a41b 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -685,6 +685,151 @@ add_path_precheck(RelOptInfo *parent_rel,
 	return true;
 }
 
+/*
+ * add_partial_path
+ *	  Like add_path, our goal here is to consider whether a path is worthy
+ *	  of being kept around, but the considerations here are a bit different.
+ *	  A partial path is one which can be executed in any number of workers
+ *	  in parallel such that each worker will generate a subset of the path's
+ *	  overall result.
+ *
+ *	  We don't generate parameterized partial paths because they seem
+ *	  unlikely ever to be worthwhile.  The only way we could ever use such a
+ *	  path is by executing a nested loop with a complete path on the outer
+ *	  side - thus, each worker would scan the entire outer relation - and the
+ *	  partial path on the inner side - thus, each worker would scan only part
+ *	  of the inner relation.  This is silly: a parameterized path is
+ *	  generally going to be based on an index scan, and we can't generate a
+ *	  partial path for that.  And it's generally only going to produce a few
+ *	  rows, so splitting them up between workers doesn't really make sense.
+ *	  It would be better to use an unparameterized partial path on the outer
+ *	  side of the join with a parameterized complete path on the inner side
+ *	  in virtually every case.
+ *
+ *	  Because we don't need to consider parameterized paths here, we also
+ *	  don't need to consider the row counts as a measure of quality: every
+ *	  path will produce the same number of rows.  Neither do we need to
+ *	  consider startup costs: parallelism is only used for plans that will
+ *	  be run to completion.  Therefore, this routine is much simpler than
+ *	  add_path: it needs to consider only pathkeys and total cost.
+ */
+void
+add_partial_path(RelOptInfo *parent_rel, Path *new_path)
+{
+	bool		accept_new = true;		/* unless we find a superior old path */
+	ListCell   *insert_after = NULL;	/* where to insert new item */
+	ListCell   *p1;
+	ListCell   *p1_prev;
+	ListCell   *p1_next;
+
+	/* Check for query cancel. */
+	CHECK_FOR_INTERRUPTS();
+
+	/*
+	 * As in add_path, throw out any paths which are dominated by the new
+	 * path, but throw out the new path if some existing path dominates it.
+	 */
+	p1_prev = NULL;
+	for (p1 = list_head(parent_rel->partial_pathlist); p1 != NULL;
+		 p1 = p1_next)
+	{
+		Path	   *old_path = (Path *) lfirst(p1);
+		bool		remove_old = false; /* unless new proves superior */
+		PathKeysComparison keyscmp;
+
+		p1_next = lnext(p1);
+
+		/* Compare pathkeys. */
+		keyscmp = compare_pathkeys(new_path->pathkeys, old_path->pathkeys);
+
+		/* Unless pathkeys are incompatible, keep just one of the two paths. */
+		if (keyscmp != PATHKEYS_DIFFERENT)
+		{
+			if (new_path->total_cost > old_path->total_cost * STD_FUZZ_FACTOR)
+			{
+				/* New path costs more; keep it only if pathkeys are better. */
+				if (keyscmp != PATHKEYS_BETTER1)
+					accept_new = false;
+			}
+			else if (old_path->total_cost > new_path->total_cost
+					 * STD_FUZZ_FACTOR)
+			{
+				/* Old path costs more; keep it only if pathkeys are better. */
+				if (keyscmp != PATHKEYS_BETTER2)
+					remove_old = true;
+			}
+			else if (keyscmp == PATHKEYS_BETTER1)
+			{
+				/* Costs are about the same, new path has better pathkeys. */
+				remove_old = true;
+			}
+			else if (keyscmp == PATHKEYS_BETTER2)
+			{
+				/* Costs are about the same, old path has better pathkeys. */
+				accept_new = false;
+			}
+			else if (old_path->total_cost > new_path->total_cost * 1.0000000001)
+			{
+				/* Pathkeys are the same, and the old path costs more. */
+				remove_old = true;
+			}
+			else
+			{
+				/*
+				 * Pathkeys are the same, and new path isn't materially
+				 * cheaper.
+				 */
+				accept_new = false;
+			}
+		}
+
+		/*
+		 * Remove current element from partial_pathlist if dominated by new.
+		 */
+		if (remove_old)
+		{
+			parent_rel->partial_pathlist =
+				list_delete_cell(parent_rel->partial_pathlist, p1, p1_prev);
+			/* add_path has a special case for IndexPath; we don't need it */
+			Assert(!IsA(old_path, IndexPath));
+			pfree(old_path);
+			/* p1_prev does not advance */
+		}
+		else
+		{
+			/* new belongs after this old path if it has cost >= old's */
+			if (new_path->total_cost >= old_path->total_cost)
+				insert_after = p1;
+			/* p1_prev advances */
+			p1_prev = p1;
+		}
+
+		/*
+		 * If we found an old path that dominates new_path, we can quit
+		 * scanning the partial_pathlist; we will not add new_path, and we
+		 * assume new_path cannot dominate any later path.
+		 */
+		if (!accept_new)
+			break;
+	}
+
+	if (accept_new)
+	{
+		/* Accept the new path: insert it at proper place */
+		if (insert_after)
+			lappend_cell(parent_rel->partial_pathlist, insert_after, new_path);
+		else
+			parent_rel->partial_pathlist =
+				lcons(new_path, parent_rel->partial_pathlist);
+	}
+	else
+	{
+		/* add_path has a special case for IndexPath; we don't need it */
+		Assert(!IsA(new_path, IndexPath));
+		/* Reject and recycle the new path */
+		pfree(new_path);
+	}
+}
+
 /*****************************************************************************
  *		PATH NODE CREATION ROUTINES
@@ -697,7 +842,7 @@
  */
 Path *
 create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
-					Relids required_outer, int nworkers)
+					Relids required_outer, int parallel_degree)
 {
 	Path	   *pathnode = makeNode(Path);
 
@@ -705,10 +850,11 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->parent = rel;
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
-	pathnode->parallel_aware = nworkers > 0 ? true : false;
+	pathnode->parallel_aware = parallel_degree > 0 ? true : false;
+	pathnode->parallel_degree = parallel_degree;
 	pathnode->pathkeys = NIL;	/* seqscan has unordered result */
 
-	cost_seqscan(pathnode, root, rel, pathnode->param_info, nworkers);
+	cost_seqscan(pathnode, root, rel, pathnode->param_info);
 
 	return pathnode;
 }
@@ -727,6 +873,7 @@ create_samplescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = NIL;	/* samplescan has unordered result */
 
 	cost_samplescan(pathnode, root, rel, pathnode->param_info);
@@ -781,6 +928,7 @@ create_index_path(PlannerInfo *root,
 	pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
 														  required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = pathkeys;
 
 	/* Convert clauses to indexquals the executor can handle */
@@ -827,6 +975,7 @@ create_bitmap_heap_path(PlannerInfo *root,
 	pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
 														  required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = NIL;		/* always unordered */
 
 	pathnode->bitmapqual = bitmapqual;
@@ -853,6 +1002,7 @@ create_bitmap_and_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	pathnode->path.param_info = NULL;	/* not used in bitmap trees */
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = NIL;		/* always unordered */
 
 	pathnode->bitmapquals = bitmapquals;
@@ -878,6 +1028,7 @@ create_bitmap_or_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	pathnode->path.param_info = NULL;	/* not used in bitmap trees */
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = NIL;		/* always unordered */
 
 	pathnode->bitmapquals = bitmapquals;
@@ -903,6 +1054,7 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
 	pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
 														  required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = NIL;		/* always unordered */
 
 	pathnode->tidquals = tidquals;
@@ -921,7 +1073,8 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
 * Note that we must handle subpaths = NIL, representing a dummy access path.
 */
 AppendPath *
-create_append_path(RelOptInfo *rel, List *subpaths, Relids required_outer)
+create_append_path(RelOptInfo *rel, List *subpaths, Relids required_outer,
+				   int parallel_degree)
 {
 	AppendPath *pathnode = makeNode(AppendPath);
 	ListCell   *l;
@@ -931,6 +1084,7 @@ create_append_path(RelOptInfo *rel, List *subpaths, Relids required_outer)
 	pathnode->path.param_info = get_appendrel_parampathinfo(rel,
 															required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = parallel_degree;
 	pathnode->path.pathkeys = NIL;		/* result is always considered
 										 * unsorted */
 	pathnode->subpaths = subpaths;
@@ -985,6 +1139,7 @@ create_merge_append_path(PlannerInfo *root,
 	pathnode->path.param_info = get_appendrel_parampathinfo(rel,
 															required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->subpaths = subpaths;
 
@@ -1060,6 +1215,7 @@ create_result_path(List *quals)
 	pathnode->path.parent = NULL;
 	pathnode->path.param_info = NULL;	/* there are no other rels... */
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = NIL;
 
 	pathnode->quals = quals;
@@ -1094,6 +1250,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
 	pathnode->path.parent = rel;
 	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = subpath->pathkeys;
 
 	pathnode->subpath = subpath;
@@ -1155,6 +1312,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parent = rel;
 	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 
 	/*
 	 * Assume the output is unsorted, since we don't necessarily have pathkeys
@@ -1328,7 +1486,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
  */
 GatherPath *
 create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
-				   Relids required_outer, int nworkers)
+				   Relids required_outer)
 {
 	GatherPath *pathnode = makeNode(GatherPath);
 
@@ -1336,11 +1494,18 @@ create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->path.parent = rel;
 	pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
 														  required_outer);
-	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = subpath->parallel_degree;
 	pathnode->path.pathkeys = NIL;		/* Gather has unordered result */
 
 	pathnode->subpath = subpath;
-	pathnode->num_workers = nworkers;
+	pathnode->single_copy = false;
+
+	if (pathnode->path.parallel_degree == 0)
+	{
+		pathnode->path.parallel_degree = 1;
+		pathnode->path.pathkeys = subpath->pathkeys;
+		pathnode->single_copy = true;
+	}
 
 	cost_gather(pathnode, root, rel, pathnode->path.param_info);
 
@@ -1393,6 +1558,7 @@ create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = pathkeys;
 
 	cost_subqueryscan(pathnode, root, rel, pathnode->param_info);
@@ -1416,6 +1582,7 @@ create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = pathkeys;
 
 	cost_functionscan(pathnode, root, rel, pathnode->param_info);
@@ -1439,6 +1606,7 @@ create_valuesscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
 
 	cost_valuesscan(pathnode, root, rel, pathnode->param_info);
@@ -1461,6 +1629,7 @@ create_ctescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer)
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = NIL;	/* XXX for now, result is always unordered */
 
 	cost_ctescan(pathnode, root, rel, pathnode->param_info);
@@ -1484,6 +1653,7 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->param_info = get_baserel_parampathinfo(root, rel,
 													 required_outer);
 	pathnode->parallel_aware = false;
+	pathnode->parallel_degree = 0;
 	pathnode->pathkeys = NIL;	/* result is always unordered */
 
 	/* Cost is the same as for a regular CTE scan */
@@ -1516,6 +1686,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
 	pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
 														  required_outer);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.rows = rows;
 	pathnode->path.startup_cost = startup_cost;
 	pathnode->path.total_cost = total_cost;
@@ -1651,6 +1822,7 @@ create_nestloop_path(PlannerInfo *root,
 							  required_outer,
 							  &restrict_clauses);
 	pathnode->path.parallel_aware = false;
+	pathnode->path.parallel_degree = 0;
 	pathnode->path.pathkeys = pathkeys;
 	pathnode->jointype = jointype;
 	pathnode->outerjoinpath = outer_path;
@@ -1709,6 +1881,7 @@ create_mergejoin_path(PlannerInfo *root,
 								  required_outer,
 								  &restrict_clauses);
 	pathnode->jpath.path.parallel_aware = false;
+	pathnode->jpath.path.parallel_degree = 0;
 	pathnode->jpath.path.pathkeys = pathkeys;
 	pathnode->jpath.jointype = jointype;
 	pathnode->jpath.outerjoinpath = outer_path;
@@ -1766,6 +1939,7 @@ create_hashjoin_path(PlannerInfo *root,
 								  required_outer,
 								  &restrict_clauses);
 	pathnode->jpath.path.parallel_aware = false;
+	pathnode->jpath.path.parallel_degree = 0;
 
 	/*
 	 * A hashjoin never has pathkeys, since its output ordering is
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 996b7fe..8d7ac48 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -107,6 +107,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->reltargetlist = NIL;
 	rel->pathlist = NIL;
 	rel->ppilist = NIL;
+	rel->partial_pathlist = NIL;
 	rel->cheapest_startup_path = NULL;
 	rel->cheapest_total_path = NULL;
 	rel->cheapest_unique_path = NULL;
@@ -369,6 +370,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->reltargetlist = NIL;
 	joinrel->pathlist = NIL;
 	joinrel->ppilist = NIL;
+	joinrel->partial_pathlist = NIL;
 	joinrel->cheapest_startup_path = NULL;
 	joinrel->cheapest_total_path = NULL;
 	joinrel->cheapest_unique_path = NULL;
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 9a0dd28..bdf4c53 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -458,6 +458,7 @@ typedef struct RelOptInfo
 	List	   *reltargetlist;	/* Vars to be output by scan of relation */
 	List	   *pathlist;		/* Path structures */
 	List	   *ppilist;		/* ParamPathInfos used in pathlist */
+	List	   *partial_pathlist;		/* partial Paths */
 	struct Path *cheapest_startup_path;
 	struct Path *cheapest_total_path;
 	struct Path *cheapest_unique_path;
@@ -755,6 +756,7 @@ typedef struct Path
 	RelOptInfo *parent;			/* the relation this path can build */
 	ParamPathInfo *param_info;	/* parameterization info, or NULL if none */
 	bool		parallel_aware; /* engage parallel-aware logic? */
+	int			parallel_degree;		/* desired parallel degree; 0 = not parallel */
 
 	/* estimated size/costs for path (see costsize.c for more info) */
 	double		rows;			/* estimated number of result tuples */
@@ -1057,7 +1059,6 @@ typedef struct GatherPath
 {
 	Path		path;
 	Path	   *subpath;		/* path for each worker */
-	int			num_workers;	/* number of workers sought to help */
 	bool		single_copy;	/* path must not be executed >1x */
 } GatherPath;
 
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index ac21a3a..25a7303 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -72,7 +72,7 @@ extern double clamp_row_est(double nrows);
 extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
 					double index_pages, PlannerInfo *root);
 extern void cost_seqscan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
-			 ParamPathInfo *param_info, int nworkers);
+			 ParamPathInfo *param_info);
 extern void cost_samplescan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
 				ParamPathInfo *param_info);
 extern void cost_index(IndexPath *path, PlannerInfo *root,
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index f28b4e2..38d4859 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -29,9 +29,10 @@ extern void add_path(RelOptInfo *parent_rel, Path *new_path);
 extern bool add_path_precheck(RelOptInfo *parent_rel,
 				  Cost startup_cost, Cost total_cost,
 				  List *pathkeys, Relids required_outer);
+extern void add_partial_path(RelOptInfo *parent_rel, Path *new_path);
 
 extern Path *create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
-					Relids required_outer, int nworkers);
+					Relids required_outer, int parallel_degree);
 extern Path *create_samplescan_path(PlannerInfo *root, RelOptInfo *rel,
 					Relids required_outer);
 extern IndexPath *create_index_path(PlannerInfo *root,
@@ -59,7 +60,7 @@ extern BitmapOrPath *create_bitmap_or_path(PlannerInfo *root,
 extern TidPath *create_tidscan_path(PlannerInfo *root, RelOptInfo *rel,
 					List *tidquals, Relids required_outer);
 extern AppendPath *create_append_path(RelOptInfo *rel, List *subpaths,
-				   Relids required_outer);
+				   Relids required_outer, int parallel_degree);
 extern MergeAppendPath *create_merge_append_path(PlannerInfo *root,
 						 RelOptInfo *rel,
 						 List *subpaths,
@@ -70,8 +71,7 @@ extern MaterialPath *create_material_path(RelOptInfo *rel, Path *subpath);
 extern UniquePath *create_unique_path(PlannerInfo *root, RelOptInfo *rel,
 					Path *subpath, SpecialJoinInfo *sjinfo);
 extern GatherPath *create_gather_path(PlannerInfo *root,
-					RelOptInfo *rel, Path *subpath, Relids required_outer,
-					int nworkers);
+					RelOptInfo *rel, Path *subpath, Relids required_outer);
 extern Path *create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel,
 					List *pathkeys, Relids required_outer);
 extern Path *create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,