[ 
https://issues.apache.org/jira/browse/IMPALA-14061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18015436#comment-18015436
 ] 

ASF subversion and git services commented on IMPALA-14061:
----------------------------------------------------------

Commit 5244f6169e0529a4b40defca26b3aed266b0fee3 in impala's branch 
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5244f6169 ]

IMPALA-14061: Calcite Planner: added Calcite rules

This commit adds Calcite optimization rules to create more efficient
plans. These rules should be considered a work in progress. They
were tested against a 3TB TPC-DS database, so they are fairly efficient
as is, but we can make improvements as we find them along the way.

Most of the changes have been added to the CalciteOptimizer file. There
are several phases of rules that are applied, which are as follows:

- expand nodes: These rules change the plan into one that can be
  handled by Impala. For instance, there are RelNodes such as
  "LogicalIntersect" that have no direct counterpart among the Impala
  physical nodes, so they need to be expanded.
- coerce nodes: This module changes the nodes so they have the
  correct datatypes (e.g. literal strings in Calcite are char
  but need to be varchar for Impala).
- optimize nodes: First pass at reordering the logical RelNodes.
- join: Collapses the join RelNodes into one "multiJoin" and then
  lets Calcite's join optimizer reorder the joins into a more optimal
  plan. Note that in this iteration, statistics are still not being
  applied. Later commits will add them to make better plans.
- post join optimize nodes: Reruns the optimize nodes rules, since the
  join ordering may present new optimization opportunities.
- pre Impala commit: Extra adjustments that are made at the end,
  after optimization.
- conversion to Impala RelNodes: Maps Calcite RelNodes into Impala
  RelNodes, which will then be mapped to Impala PlanNodes.
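
As a rough illustration of the phased approach described above, here is a
minimal sketch in Python. It is not the CalciteOptimizer code and uses no
Calcite API: plans are plain tuples, and every rule name (expand_intersect,
merge_filters, collapse_joins) and the PHASES table are hypothetical
stand-ins. Only a few of the phases are modeled, and join reordering itself
is left out.

```python
# Toy plan node: (operator_name, *args). Strings are leaves (table names
# or predicate text). Everything here is illustrative, not Impala code.

def rewrite_bottom_up(plan, rule):
    """Apply one rule to every node of the plan, children first."""
    if not isinstance(plan, tuple):
        return plan
    op, *args = plan
    rebuilt = (op, *[rewrite_bottom_up(a, rule) for a in args])
    return rule(rebuilt)

def run_phase(plan, rules, max_passes=10):
    """Run a phase's rules repeatedly until the plan stops changing."""
    for _ in range(max_passes):
        before = plan
        for rule in rules:
            plan = rewrite_bottom_up(plan, rule)
        if plan == before:
            break
    return plan

def expand_intersect(node):
    # "expand nodes" stand-in: replace an operator the backend lacks
    # with ones it has (a toy rewrite, not the one Calcite performs).
    if node[0] == "Intersect":
        return ("SemiJoin", ("Distinct", node[1]), node[2])
    return node

def merge_filters(node):
    # "optimize nodes" stand-in: fuse two stacked filters into one.
    if node[0] == "Filter" and isinstance(node[2], tuple) and node[2][0] == "Filter":
        inner = node[2]
        return ("Filter", f"{node[1]} AND {inner[1]}", inner[2])
    return node

def collapse_joins(node):
    # "join" stand-in: flatten nested binary joins into one MultiJoin
    # so that a join-ordering algorithm could see all inputs at once.
    if node[0] != "Join":
        return node
    inputs = []
    for child in node[1:]:
        if child[0] == "MultiJoin":
            inputs.extend(child[1:])
        else:
            inputs.append(child)
    return ("MultiJoin", *inputs)

# Phases run strictly in order; each one runs to a fixed point.
PHASES = [
    ("expand nodes", [expand_intersect]),
    ("optimize nodes", [merge_filters]),
    ("join", [collapse_joins]),
    ("post join optimize nodes", [merge_filters]),
]

def optimize(plan):
    for _name, rules in PHASES:
        plan = run_phase(plan, rules)
    return plan

example = ("Filter", "a > 1",
           ("Filter", "b < 2",
            ("Join",
             ("Join", ("Scan", "t1"), ("Scan", "t2")),
             ("Intersect", ("Scan", "t3"), ("Scan", "t4")))))
optimized = optimize(example)
```

Keeping each phase as its own rule list means the same rules can be rerun
after join collapsing simply by listing them again.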

In addition to this general change, the "toCNF" rule has been
removed. Calcite creates a SEARCH operator in multiple places when it
"simplifies" the RexNodes within various rules. This operator is not
supported directly in Impala, so we need to call "expandSearch" to
handle it. Because Calcite does this under the covers in the rules,
this has been fixed by overriding the RexBuilder (with
ImpalaRexBuilder) and expanding the SEARCH operator whenever it is
created (side note: we could have changed the rules that call
simplify, but that would have resulted in too much code duplication).
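
To make the idea concrete, here is a minimal sketch in Python of expanding
a SEARCH-style predicate at construction time. It is a toy model, not
Calcite's expandSearch or the actual ImpalaRexBuilder: the Range class, the
tuple-based expressions, and the ExpandingBuilder.make_call method are all
hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Range:
    """One interval of a search argument; None means unbounded."""
    lo: Optional[int]
    hi: Optional[int]
    lo_closed: bool = True
    hi_closed: bool = True

def expand_range(col, r):
    """Turn one interval into plain comparisons on col."""
    if r.lo is not None and r.lo == r.hi and r.lo_closed and r.hi_closed:
        return ("=", col, r.lo)  # a single point
    terms = []
    if r.lo is not None:
        terms.append((">=" if r.lo_closed else ">", col, r.lo))
    if r.hi is not None:
        terms.append(("<=" if r.hi_closed else "<", col, r.hi))
    if not terms:
        return ("TRUE",)  # unbounded on both sides
    return terms[0] if len(terms) == 1 else ("AND", *terms)

def expand_search(col, ranges):
    """Rewrite SEARCH(col, ranges) as an OR of per-interval predicates."""
    disjuncts = [expand_range(col, r) for r in ranges]
    if not disjuncts:
        return ("FALSE",)  # an empty set of ranges matches nothing
    return disjuncts[0] if len(disjuncts) == 1 else ("OR", *disjuncts)

class ExpandingBuilder:
    """Toy expression builder that never lets a SEARCH call escape."""
    def make_call(self, op, *operands):
        if op == "SEARCH":
            col, ranges = operands
            return expand_search(col, ranges)
        return (op, *operands)

builder = ExpandingBuilder()
expanded = builder.make_call(
    "SEARCH", "x", [Range(1, 1), Range(5, 10, True, False)])
```

Intercepting at the builder means every caller gets the expanded form, no
matter which rule asked for the expression.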

The toCNF logic that was in the removed rule is now a call within the
CoerceOperandShuttle, which already manipulates all the RexNodes, so
all that code is now in one place.
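
For readers unfamiliar with the transformation, here is a minimal Python
sketch of conjunctive normal form (CNF) conversion applied in the same pass
as a literal coercion. It is a toy model, not the CoerceOperandShuttle: the
tuple expressions, the CHAR_LITERAL and VARCHAR_LITERAL tags, and the
function names are hypothetical. It also does no size limiting, although
CNF conversion can grow an expression exponentially.

```python
def conjuncts(e):
    """Operands of a top-level AND, or the expression itself."""
    return list(e[1:]) if isinstance(e, tuple) and e[0] == "AND" else [e]

def disjuncts(e):
    """Operands of a top-level OR, or the expression itself."""
    return list(e[1:]) if isinstance(e, tuple) and e[0] == "OR" else [e]

def make_and(items):
    return items[0] if len(items) == 1 else ("AND", *items)

def make_or(items):
    return items[0] if len(items) == 1 else ("OR", *items)

def to_cnf(expr):
    """Rewrite nested AND/OR into an AND of ORs by distributing OR."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    if op == "AND":
        flat = []
        for a in args:
            flat.extend(conjuncts(to_cnf(a)))
        return make_and(flat)
    if op == "OR":
        # Cross product: pick one conjunct from each operand per clause.
        clauses = [[]]
        for a in args:
            parts = conjuncts(to_cnf(a))
            clauses = [c + disjuncts(p) for c in clauses for p in parts]
        return make_and([make_or(c) for c in clauses])
    return expr  # comparisons and literals are left untouched

def coerce(expr):
    """Toy coercion: retag char literals as varchar literals."""
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "CHAR_LITERAL":
        return ("VARCHAR_LITERAL", expr[1])
    return (expr[0], *[coerce(a) for a in expr[1:]])

def shuttle(expr):
    """Single entry point: coerce operands, then normalize to CNF."""
    return to_cnf(coerce(expr))

name_is_x = ("=", "c", ("CHAR_LITERAL", "x"))
normalized = shuttle(("OR", name_is_x, ("AND", "p", "q")))
```

Here `normalized` is an AND of two OR clauses, each containing the coerced
equality on `c` together with `p` or `q` respectively.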

Change-Id: I6671f7ed298a18965ef0b7a5fc10f4912333a52b
Reviewed-on: http://gerrit.cloudera.org:8080/22870
Reviewed-by: Aman Sinha <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Calcite Planner: add basic rules for performance
> ------------------------------------------------
>
>                 Key: IMPALA-14061
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14061
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Steve Carlin
>            Assignee: Steve Carlin
>            Priority: Major
>
> Let's get in some basic Calcite rules for helping performance on queries.  
> This Jira will be used for the first pass.  More rules will be added later on.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
