Anonymous Coward (846) has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23315 )

Change subject: IMPALA-14102: [part 3] Calcite Planner: Customize 
LoptOptimizeJoinRule
......................................................................


Patch Set 8:

(6 comments)

I mainly left comments on the Calcite part cause I can't really tell if the 
plan changes are good or bad. I assume that if the performance of the queries 
is good then its fine to move forward.

http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java:

http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@92
PS8, Line 92:  * - When attempting the check of whether to push down a join 
versus keeping the current
            :  *   join, the epsilon on the tiebreaker has been made a little 
wider. If it does go
            :  *   to the tiebreaker, it compares the "leftest most" 
cardinality on the 2 choices. This
            :  *   allows a more effective runtime filter to be used, if found.
When the cost is too close it makes sense to have a custom tiebraker logic if 
it can lead to better plans. Currently, the rowWidthCost method is used as a 
tiebraker but we could abstract this part and make it configurable by opening a 
PR in Calcite upstream. In the meantime we can keep this code as is here if it 
helps Impala produce better plans.


http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@97
PS8, Line 97:  * - "Complex" trees threw a bit of a wrench into runtime 
filtering. A "simple" tree is
            :  *   defined here as a tree with only projects and filters. If a 
tree is complex, the row
            :  *   count will have gone through some major transformation, and 
it isn't quite apparent
            :  *   based on the row count at the top level that a runtime 
filter could have a major
            :  *   impact. It especially caused problems when the complex tree 
was processed in between
            :  *   simple trees. Because the row count showed up low, a swap 
might occur with a simple
            :  *   tree where it shouldn't have. In practice, it helped 
extremely to delay the
            :  *   processing of complex trees until after all the simple trees 
have been processed.
>From the description I basically understand that we would like to ignore the 
>effects that certain "complex" operators have on row count.

One way to achieve this would be to use a custom metadata provider. For 
instance, instead of using RelMdRowCount we could define our own 
ImpalaJoinMdRowCount and handle each operator as we want during the join 
reordering phase.

Another way, would be to implement your own CostFunction 
(https://github.com/apache/calcite/blob/68c540b194385fde2b5b21338ac7bffcdb7c8679/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java#L2087)
 and plug it in.

Abstracting the costing logic from the algorithm it self is helpful for two 
reasons:
* No need to maintain a separate version of LoptOptimizeJoinRule in Impala
* Easier to migrate to another join enumeration algorithm in the future (e.g., 
DphypJoinReorderRule, MultiJoinOptimizeBushyRule)


http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@106
PS8, Line 106:  * - Similar to the last point, we also do not want to swap the 
left and right side
The decision on when to swap two inputs should also be configurable since not 
all projects want the smaller side to be on the right side. This is another 
change that makes sense to land in Calcite upstream.


http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@910
PS8, Line 910:       // We want to process complex trees last. There can 
potentially
             :       // be a scan that is eligible for a runtime filter but we 
should
             :       // process all the simple trees first because the runtime
             :       // filter may help a portion of the complex tree but not 
the whole
             :       // tree.  This is applicable to q95 of tpcds
             :       if (isSimpleTree(multiJoin, nextFactor) && (dimWeight != 
0)) {
             :         if (isComplexTree(multiJoin, factor)) {
             :           continue;
             :         }
             :       } else if (isComplexTree(multiJoin, nextFactor)) {
             :         if (isSimpleTree(multiJoin, factor) && dimWeight != 0) {
             :           nextFactor = factor;
             :           bestWeight = dimWeight;
             :           bestCardinality = cardinality;
             :           continue;
             :         }
             :       }
As I mentioned previously, it would be better to abstract this logic somewhere 
so that the core of this method remains unchanged.

Before the decision about the next/best factor was based on the weight (type of 
the join), and the estimated cardinality. Now there is third criterion that is 
the complexity of the tree that seems to prevail the rest.


http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@1999
PS8, Line 1999: // IMPALA CHANGE: Commented this out because LoptJoinTree.Leaf
              :     // is protected within Calcite and causes a compilation 
error.
              :     // It's probably rare enough that it won't cause too many 
issues
              :     // but this should be addressed at some point.
              :     /*
              :     if (selfJoin) {
              :       return !multiJoin.isLeftFactorInRemovableSelfJoin(
              :           ((LoptJoinTree.Leaf) left.getFactorTree()).getId());
              :     }
              :     */
The TPC-DS queries have lots of self-joins so not sure if this is a really rare 
event. I guess it is acceptable if Impala opts to ignore it but having it on or 
off may have an impact on the plans.


http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@2081
PS8, Line 2081:  /**
              :    * Check if the RelNode is a complex tree. One piece of 
tricky logic: It is possible
              :    * that the current RelNode is a Join RelNode and part of the 
RelNode tree that is
              :    * being built from this MultiJoin. So any joins at the top 
level are ok for this check.
              :    * This method just wants to ensure that the whole RelNode 
tree consists of
              :    * simple trees.
              :    */
              :   private static boolean isComplexTree(RelNode relNode, boolean 
ignoreJoin) {
              :     if (relNode instanceof HepRelVertex) {
              :       relNode = ((HepRelVertex)relNode).getCurrentRel();
              :     }
              :
              :     if ((relNode instanceof Aggregate) || (relNode instanceof 
Union)) {
              :       return true;
              :     }
              :
              :     if (relNode instanceof Join) {
              :       if (!ignoreJoin) {
              :         return true;
              :       }
              :     } else {
              :       ignoreJoin = false;
              :     }
              :
              :     for (RelNode input : relNode.getInputs()) {
              :       if (isComplexTree(input, ignoreJoin)) {
              :         return true;
              :       }
              :     }
              :     return false;
              :   }
Not sure if this method will stay but if it does you may be able to use 
org.apache.calcite.rel.metadata.BuiltInMetadata.NodeTypes



--
To view, visit http://gerrit.cloudera.org:8080/23315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I95bba7bafcc7dd86f17921c11740ac349daac6c7
Gerrit-Change-Number: 23315
Gerrit-PatchSet: 8
Gerrit-Owner: Steve Carlin <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Anonymous Coward (846)
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Fri, 19 Sep 2025 09:09:46 +0000
Gerrit-HasComments: Yes

Reply via email to