Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels

Haisheng Yuan Mon, 13 Jan 2020 20:06:45 -0800

The example is valid if Calcite doesn't do normalization or preprocessing 
before going to VolcanoPlanner.
But many databases and big data systems preprocess the expression 
(pushing down predicates, etc.) so that expressions in the same group can 
share the same logical properties, in most cases if not all. You may argue 
that this should be cost-based, e.g. evaluating a filter early can still be 
bad. That is true, but how accurate will the statistics be, and how accurate 
will the cost model be?


- Haisheng

------------------------------------------------------------------
From: Julian Hyde<[email protected]>
Date: 2020-01-13 08:44:54
To: [email protected]<[email protected]>
Subject: Re: [DISCUSS] Proposal to add API to force rules matching specific rels

> MEMO group (RelSet) represents logically equivalent expressions.
> All the expressions in one group should share the same logical
> properties, e.g. functional dependency, constraint properties etc.
> But they are not sharing it. Computation is done for each individual operator.

It's good, and correct, that we compute for each individual operator.

Suppose that a RelSubset s contains RelNode r1 and we know that the
constraint x > 0 holds for r1. Suppose that we also have r2 with
constraint y < 10, and we discover that r1 and r2 are equivalent and
belong together in s. Now we can safely say that both constraints
(x > 0 and y < 10) apply to both r1 and r2.

Deducing additional constraints in this way is a big win. The effort
to compute constraints for each RelNode is well-spent.

This kind of deduction applies to other logical properties (e.g.
unique keys) and it applies to RelSet as well as RelSubset.
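
As a rough illustration of that deduction, here is a toy Python sketch. It
is not Calcite's API: the RelNode and EquivalenceSet classes below are
hypothetical stand-ins, and constraints are modelled as plain strings
rather than RexNode predicates. It only shows that constraints computed
per node can be unioned once nodes are found to be equivalent.

```python
from dataclasses import dataclass, field


@dataclass
class RelNode:
    """Hypothetical stand-in for a relational expression (not Calcite's RelNode)."""
    name: str
    constraints: frozenset  # predicates known to hold on this node's output


@dataclass
class EquivalenceSet:
    """Hypothetical stand-in for a RelSet/RelSubset of equivalent expressions."""
    members: list = field(default_factory=list)

    def add(self, node):
        self.members.append(node)

    def constraints(self):
        # Equivalent expressions produce the same rows, so a predicate
        # proven for any one member holds for every member: take the union.
        result = frozenset()
        for node in self.members:
            result |= node.constraints
        return result


s = EquivalenceSet()
s.add(RelNode("r1", frozenset({"x > 0"})))
s.add(RelNode("r2", frozenset({"y < 10"})))
print(sorted(s.constraints()))  # ['x > 0', 'y < 10']
```

Each node still contributes only what was computed for it individually;
the set-level view is derived from the members.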

Julian


On Sun, Jan 12, 2020 at 10:10 AM Roman Kondakov
<[email protected]> wrote:
>
> @Haisheng
>
> > Calcite uses Project operator and all kinds of ProjectXXXTransposeRule to 
> > prune unused columns.
>
> I have also noticed that in most cases Project-related rules take a
> significant part of the planning time. But I haven't explored this
> problem yet.
>
> > MEMO group (RelSet) represents logically equivalent expressions. All the 
> > expressions in one group should share the same logical properties, e.g. 
> > functional dependency, constraint properties etc. But they are not sharing 
> > it. Computation is done for each individual operator.
>
> I thought the equivalence of logical properties within groups (RelSets)
> was implicit. For example, in RelSet#addInternal it is always verified
> that the newly added node has the same type as the other members of the
> set.
>
> Anyway, I absolutely agree with you that the problems with trait
> propagation, rule matching and the others that you mentioned in the
> previous e-mails should be solved first. We first need to make the
> Volcano optimizer work correctly, and only then make improvements like
> search space pruning.
>
> I would love to join this work to improve the Volcano planner. Looking
> forward to the design doc.
>
>
> --
> Kind Regards
> Roman Kondakov
>
>
> On 11.01.2020 11:42, Haisheng Yuan wrote:
> > Currently, Calcite uses the Project operator and all kinds of 
> > ProjectXXXTransposeRule to prune unused columns. Every operator's output 
> > columns use an index to reference the child operator's columns. If there 
> > is a Project operator whose child is a Filter, and we push the Project 
> > down under the Filter, we will have Project-Filter-Project-FilterInput. 
> > All the newly generated RelNodes will trigger rule matches. For example, 
> > if we have already applied ReduceExpressionsRule to the filter, then 
> > because the input ref indexes in the new filter's RexCall have changed, 
> > we have to apply ReduceExpressionsRule to the new filter again, even if 
> > there is nothing that can be reduced. Similarly, the operator 
> > transpose/merge rules will be triggered for the new operators. This can 
> > trigger a lot of rule matches.
> >
> > MEMO group (RelSet) represents logically equivalent expressions. All the 
> > expressions in one group should share the same logical properties, e.g. 
> > functional dependency, constraint properties etc. But they are not sharing 
> > it. Computation is done for each individual operator.
> >
> > Without resolving those issues, search space pruning won't help much.
> >
> > There is a lot of room for improvement. I hope the community can join 
> > the effort to make Calcite better.
> >
> > - Haisheng
> >
> > ------------------------------------------------------------------
> > From: Roman Kondakov<[email protected]>
> > Date: 2020-01-10 19:39:51
> > To: <[email protected]>
> > Subject: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
> >
> > @Haisheng, could you please clarify what you mean by these points?
> >
> >> - the poor-design of column pruning,
> >> - lack of group properties etc.
> >
> > I guess I'm not aware of these problems.
> >
