Hi, + if (!OidIsValid(col->collOid) && + type_is_collatable(col->typeName->typeOid)) + ereport(ERROR, ... + attrList = lappend(attrList, col);
Should attrList be freed when ereport is called ? + query->CTASParallelInsInfo &= CTAS_PARALLEL_INS_UNDEF; Since CTAS_PARALLEL_INS_UNDEF is 0, isn't the above equivalent to assigning the value of 0 ? Cheers On Wed, Dec 9, 2020 at 5:43 PM Bharath Rupireddy < bharath.rupireddyforpostg...@gmail.com> wrote: > On Wed, Dec 9, 2020 at 10:16 AM Dilip Kumar <dilipbal...@gmail.com> wrote: > > > > On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie <houzj.f...@cn.fujitsu.com> > wrote: > > > > > > > > I'm not quite sure how to address this. Can we not allow the > planner > > > > > to consider that the select is for CTAS and check only after the > > > > > planning is done for the Gather node and other checks? > > > > > > > > > > > > > IIUC, you are saying that we should not influence the cost of gather > node > > > > even when the insertion would be done by workers? I think that > should be > > > > our fallback option anyway but that might miss some paths to be > considered > > > > parallel where the cost becomes more due to parallel_tuple_cost (aka > tuple > > > > transfer cost). I think the idea is we can avoid the tuple transfer > cost > > > > only when Gather is the top node because only at that time we can > push > > > > insertion down, right? How about if we have some way to detect the > same > > > > before calling generate_useful_gather_paths()? I think when we are > calling > > > > apply_scanjoin_target_to_paths() in grouping_planner(), if the > > > > query_level is 1, it is for CTAS, and it doesn't have a chance to > create > > > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we > can > > > > probably assume that the Gather will be top_node. I am not sure > about this > > > > but I think it is worth exploring. > > > > > > > > > > I took a look at the parallel insert patch and have the same idea. > > > https://commitfest.postgresql.org/31/2844/ > > > > > > * Consider generating Gather or Gather Merge paths. We must > only do this > > > * if the relation is parallel safe, and we don't do it for > child rels to > > > * avoid creating multiple Gather nodes within the same plan. > We must do > > > * this after all paths have been generated and before > set_cheapest, since > > > * one of the generated paths may turn out to be the cheapest > one. > > > */ > > > if (rel->consider_parallel && !IS_OTHER_REL(rel)) > > > generate_useful_gather_paths(root, rel, false); > > > > > > IMO Gatherpath created here seems the right one which can possible > ignore parallel cost if in CTAS. > > > But We need check the following parse option which will create path to > be the parent of Gatherpath here. > > > > > > if (root->parse->rowMarks) > > > if (limit_needed(root->parse)) > > > if (root->parse->sortClause) > > > if (root->parse->distinctClause) > > > if (root->parse->hasWindowFuncs) > > > if (root->parse->groupClause || root->parse->groupingSets || > root->parse->hasAggs || root->root->hasHavingQual) > > > > > > > Yeah, and as I pointed earlier, along with this we also need to > > consider that the RelOptInfo must be the final target(top level rel). > > > > Attaching v10 patch set that includes the change suggested above for > ignoring parallel tuple cost and also few more test cases. I split the > patch as per Amit's suggestion. v10-0001 contains parallel inserts > code without planner tuple cost changes and test cases. v10-0002 has > required changes for ignoring planner tuple cost calculations. > > Please review it further. > > After the review and addressing all the comments, I plan to make some > code common so that it can be used for Parallel Inserts in REFRESH > MATERIALIZED VIEW. Thoughts? > > With Regards, > Bharath Rupireddy. > EnterpriseDB: http://www.enterprisedb.com >