On Wed, Jan 6, 2021 at 2:09 PM Antonin Houska <a...@cybertec.at> wrote: > > Greg Nancarrow <gregn4...@gmail.com> wrote: > > > Posting an updated set of patches to address recent feedback: > > Following is my review. > .. > > v11-0003-Enable-parallel-INSERT-and-or-SELECT-for-INSERT-INTO.patch > ------------------------------------------------------------------- > > @@ -1021,12 +1039,15 @@ IsInParallelMode(void) > * Prepare for entering parallel mode, based on command-type. > */ > void > -PrepareParallelMode(CmdType commandType) > +PrepareParallelMode(CmdType commandType, bool isParallelModifyLeader) > { > Assert(!IsInParallelMode() || force_parallel_mode != > FORCE_PARALLEL_OFF); > > if (IsModifySupportedInParallelMode(commandType)) > { > + if (isParallelModifyLeader) > + (void) GetCurrentCommandId(true); > > I miss a comment here. I suppose this is to set currentCommandIdUsed, so that > the leader process gets a new commandId for the following statements in the > same transaction, and thus it can see the rows inserted by the parallel > workers? >
oh no, leader backend and worker backends must use the same commandId. I am also not sure if we need this because for Insert statements we already call GetCurrentCommandId(true) is standard_ExecutorStart. We don't want the rows visibility behavior for parallel-inserts any different than non-parallel ones. > If my understanding is correct, I think that the leader should not participate > in the execution of the Insert node, else it would use higher commandId than > the workers. That would be weird, although probably not data corruption. > Yeah, exactly this is the reason both leader and backends must use the same commandId. > I > wonder if parallel_leader_participation should be considered false for the > "Gather -> Insert -> ..." plans. > If what I said above is correct then this is moot. > > > @@ -208,7 +236,7 @@ ExecGather(PlanState *pstate) > } > > /* Run plan locally if no workers or enabled and not > single-copy. */ > - node->need_to_scan_locally = (node->nreaders == 0) > + node->need_to_scan_locally = (node->nworkers_launched <= 0) > || (!gather->single_copy && > parallel_leader_participation); > node->initialized = true; > } > > Is this change needed? The code just before this test indicates that nreaders > should be equal to nworkers_launched. > This change is required because we don't need to set up readers for parallel-insert unless there is a returning clause. See the below check a few lines before this change: - if (pcxt->nworkers_launched > 0) + if (pcxt->nworkers_launched > 0 && !(isParallelModifyLeader && !isParallelModifyWithReturning)) { I think this check could be simplified to if (pcxt->nworkers_launched > 0 && isParallelModifyWithReturning) or something like that. > > In grouping_planner(), this branch > > + /* Consider a supported parallel table-modification command */ > + if (IsModifySupportedInParallelMode(parse->commandType) && > + !inheritance_update && > + final_rel->consider_parallel && > + parse->rowMarks == NIL) > + { > > is very similar to creation of the non-parallel ModifyTablePaths - perhaps an > opportunity to move the common code into a new function. > +1. > > @@ -2401,6 +2494,13 @@ grouping_planner(PlannerInfo *root, bool > inheritance_update, > } > } > > + if (parallel_modify_partial_path_count > 0) > + { > + final_rel->rows = current_rel->rows; /* ??? why hasn't > this been > + > * set above somewhere ???? */ > + generate_useful_gather_paths(root, final_rel, false); > + } > + > extra.limit_needed = limit_needed(parse); > extra.limit_tuples = limit_tuples; > extra.count_est = count_est; > > A boolean variable (e.g. have_parallel_modify_paths) would suffice, there's no > need to count the paths using parallel_modify_partial_path_count. > Sounds sensible. > > @@ -252,6 +252,7 @@ set_plan_references(PlannerInfo *root, Plan *plan) > PlannerGlobal *glob = root->glob; > int rtoffset = list_length(glob->finalrtable); > ListCell *lc; > + Plan *finalPlan; > > /* > * Add all the query's RTEs to the flattened rangetable. The live > ones > @@ -302,7 +303,17 @@ set_plan_references(PlannerInfo *root, Plan *plan) > } > > /* Now fix the Plan tree */ > - return set_plan_refs(root, plan, rtoffset); > + finalPlan = set_plan_refs(root, plan, rtoffset); > + if (finalPlan != NULL && IsA(finalPlan, Gather)) > + { > + Plan *subplan = outerPlan(finalPlan); > + > + if (IsA(subplan, ModifyTable) && castNode(ModifyTable, > subplan)->returningLists != NULL) > + { > + finalPlan->targetlist = > copyObject(subplan->targetlist); > + } > + } > + return finalPlan; > } > > I'm not sure if the problem of missing targetlist should be handled here (BTW, > NIL is the constant for an empty list, not NULL). Obviously this is a > consequence of the fact that the ModifyTable node has no regular targetlist. > I think it is better to add comments along with this change. In this form, this looks quite hacky to me. > Actually I don't quite understand why (in the current master branch) the > targetlist initialized in set_plan_refs() > > /* > * Set up the visible plan targetlist as being the same as > * the first RETURNING list. This is for the use of > * EXPLAIN; the executor won't pay any attention to the > * targetlist. We postpone this step until here so that > * we don't have to do set_returning_clause_references() > * twice on identical targetlists. > */ > splan->plan.targetlist = copyObject(linitial(newRL)); > > is not used. Instead, ExecInitModifyTable() picks the first returning list > again: > > /* > * Initialize result tuple slot and assign its rowtype using the first > * RETURNING list. We assume the rest will look the same. > */ > mtstate->ps.plan->targetlist = (List *) > linitial(node->returningLists); > > So if you set the targetlist in create_modifytable_plan() (according to > best_path->returningLists), or even in create_modifytable_path(), and ensure > that it gets propagated to the Gather node (generate_gather_pahs currently > uses rel->reltarget), then you should no longer need to tweak > setrefs.c. This sounds worth investigating. > Moreover, ExecInitModifyTable() would no longer need to set the > targetlist. > I am not sure if we need to do anything about ExecInitModifyTable. If we want to unify what setrefs.c does with ExecInitModifyTable, then we can start a separate thread. Thanks for all the reviews. I would like to emphasize what I said earlier in this thread that it is better to first focus on Parallelising Selects for Insert (aka what v11-0001-Enable-parallel-SELECT-for-INSERT-INTO-.-SELECT does) as that in itself is a step towards achieving parallel inserts, doing both 0001 and 0003 at the same time can take much more time as both touches quite intricate parts of the code. -- With Regards, Amit Kapila.