Re: Parallel Inserts in CREATE TABLE AS

Zhihong Yu Wed, 09 Dec 2020 18:18:48 -0800

Hi,

+           if (!OidIsValid(col->collOid) &&
+               type_is_collatable(col->typeName->typeOid))
+               ereport(ERROR,
...
+           attrList = lappend(attrList, col);


Should attrList be freed when ereport is called ?

+       query->CTASParallelInsInfo &= CTAS_PARALLEL_INS_UNDEF;

Since CTAS_PARALLEL_INS_UNDEF is 0, isn't the above equivalent to assigning
the value of 0 ?

Cheers

On Wed, Dec 9, 2020 at 5:43 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Wed, Dec 9, 2020 at 10:16 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Tue, Dec 8, 2020 at 6:24 PM Hou, Zhijie <houzj.f...@cn.fujitsu.com>
> wrote:
> > >
> > > > > I'm not quite sure how to address this. Can we not allow the
> planner
> > > > > to consider that the select is for CTAS and check only after the
> > > > > planning is done for the Gather node and other checks?
> > > > >
> > > >
> > > > IIUC, you are saying that we should not influence the cost of gather
> node
> > > > even when the insertion would be done by workers? I think that
> should be
> > > > our fallback option anyway but that might miss some paths to be
> considered
> > > > parallel where the cost becomes more due to parallel_tuple_cost (aka
> tuple
> > > > transfer cost). I think the idea is we can avoid the tuple transfer
> cost
> > > > only when Gather is the top node because only at that time we can
> push
> > > > insertion down, right? How about if we have some way to detect the
> same
> > > > before calling generate_useful_gather_paths()? I think when we are
> calling
> > > > apply_scanjoin_target_to_paths() in grouping_planner(), if the
> > > > query_level is 1, it is for CTAS, and it doesn't have a chance to
> create
> > > > UPPER_REL (doesn't have grouping, order, limit, etc clause) then we
> can
> > > > probably assume that the Gather will be top_node. I am not sure
> about this
> > > > but I think it is worth exploring.
> > > >
> > >
> > > I took a look at the parallel insert patch and have the same idea.
> > > https://commitfest.postgresql.org/31/2844/
> > >
> > >          * Consider generating Gather or Gather Merge paths.  We must
> only do this
> > >          * if the relation is parallel safe, and we don't do it for
> child rels to
> > >          * avoid creating multiple Gather nodes within the same plan.
> We must do
> > >          * this after all paths have been generated and before
> set_cheapest, since
> > >          * one of the generated paths may turn out to be the cheapest
> one.
> > >          */
> > >         if (rel->consider_parallel && !IS_OTHER_REL(rel))
> > >                 generate_useful_gather_paths(root, rel, false);
> > >
> > > IMO Gatherpath created here seems the right one which can possible
> ignore parallel cost if in CTAS.
> > > But We need check the following parse option which will create path to
> be the parent of Gatherpath here.
> > >
> > > if (root->parse->rowMarks)
> > > if (limit_needed(root->parse))
> > > if (root->parse->sortClause)
> > > if (root->parse->distinctClause)
> > > if (root->parse->hasWindowFuncs)
> > > if (root->parse->groupClause || root->parse->groupingSets ||
> root->parse->hasAggs || root->root->hasHavingQual)
> > >
> >
> > Yeah, and as I pointed earlier, along with this we also need to
> > consider that the RelOptInfo must be the final target(top level rel).
> >
>
> Attaching v10 patch set that includes the change suggested above for
> ignoring parallel tuple cost and also few more test cases. I split the
> patch as per Amit's suggestion. v10-0001 contains parallel inserts
> code without planner tuple cost changes and test cases. v10-0002 has
> required changes for ignoring planner tuple cost calculations.
>
> Please review it further.
>
> After the review and addressing all the comments, I plan to make some
> code common so that it can be used for Parallel Inserts in REFRESH
> MATERIALIZED VIEW. Thoughts?
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>

Re: Parallel Inserts in CREATE TABLE AS

Reply via email to