Anonymous Coward (846) has posted comments on this change. ( http://gerrit.cloudera.org:8080/23315 )
Change subject: IMPALA-14102: [part 3] Calcite Planner: Customize LoptOptimizeJoinRule ...................................................................... Patch Set 8: (6 comments) I mainly left comments on the Calcite part cause I can't really tell if the plan changes are good or bad. I assume that if the performance of the queries is good then its fine to move forward. http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java: http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@92 PS8, Line 92: * - When attempting the check of whether to push down a join versus keeping the current : * join, the epsilon on the tiebreaker has been made a little wider. If it does go : * to the tiebreaker, it compares the "leftest most" cardinality on the 2 choices. This : * allows a more effective runtime filter to be used, if found. When the cost is too close it makes sense to have a custom tiebraker logic if it can lead to better plans. Currently, the rowWidthCost method is used as a tiebraker but we could abstract this part and make it configurable by opening a PR in Calcite upstream. In the meantime we can keep this code as is here if it helps Impala produce better plans. http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@97 PS8, Line 97: * - "Complex" trees threw a bit of a wrench into runtime filtering. A "simple" tree is : * defined here as a tree with only projects and filters. If a tree is complex, the row : * count will have gone through some major transformation, and it isn't quite apparent : * based on the row count at the top level that a runtime filter could have a major : * impact. It especially caused problems when the complex tree was processed in between : * simple trees. Because the row count showed up low, a swap might occur with a simple : * tree where it shouldn't have. In practice, it helped extremely to delay the : * processing of complex trees until after all the simple trees have been processed. >From the description I basically understand that we would like to ignore the >effects that certain "complex" operators have on row count. One way to achieve this would be to use a custom metadata provider. For instance, instead of using RelMdRowCount we could define our own ImpalaJoinMdRowCount and handle each operator as we want during the join reordering phase. Another way, would be to implement your own CostFunction (https://github.com/apache/calcite/blob/68c540b194385fde2b5b21338ac7bffcdb7c8679/core/src/main/java/org/apache/calcite/rel/rules/LoptOptimizeJoinRule.java#L2087) and plug it in. Abstracting the costing logic from the algorithm it self is helpful for two reasons: * No need to maintain a separate version of LoptOptimizeJoinRule in Impala * Easier to migrate to another join enumeration algorithm in the future (e.g., DphypJoinReorderRule, MultiJoinOptimizeBushyRule) http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@106 PS8, Line 106: * - Similar to the last point, we also do not want to swap the left and right side The decision on when to swap two inputs should also be configurable since not all projects want the smaller side to be on the right side. This is another change that makes sense to land in Calcite upstream. http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@910 PS8, Line 910: // We want to process complex trees last. There can potentially : // be a scan that is eligible for a runtime filter but we should : // process all the simple trees first because the runtime : // filter may help a portion of the complex tree but not the whole : // tree. This is applicable to q95 of tpcds : if (isSimpleTree(multiJoin, nextFactor) && (dimWeight != 0)) { : if (isComplexTree(multiJoin, factor)) { : continue; : } : } else if (isComplexTree(multiJoin, nextFactor)) { : if (isSimpleTree(multiJoin, factor) && dimWeight != 0) { : nextFactor = factor; : bestWeight = dimWeight; : bestCardinality = cardinality; : continue; : } : } As I mentioned previously, it would be better to abstract this logic somewhere so that the core of this method remains unchanged. Before the decision about the next/best factor was based on the weight (type of the join), and the estimated cardinality. Now there is third criterion that is the complexity of the tree that seems to prevail the rest. http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@1999 PS8, Line 1999: // IMPALA CHANGE: Commented this out because LoptJoinTree.Leaf : // is protected within Calcite and causes a compilation error. : // It's probably rare enough that it won't cause too many issues : // but this should be addressed at some point. : /* : if (selfJoin) { : return !multiJoin.isLeftFactorInRemovableSelfJoin( : ((LoptJoinTree.Leaf) left.getFactorTree()).getId()); : } : */ The TPC-DS queries have lots of self-joins so not sure if this is a really rare event. I guess it is acceptable if Impala opts to ignore it but having it on or off may have an impact on the plans. http://gerrit.cloudera.org:8080/#/c/23315/8/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaLoptOptimizeJoinRule.java@2081 PS8, Line 2081: /** : * Check if the RelNode is a complex tree. One piece of tricky logic: It is possible : * that the current RelNode is a Join RelNode and part of the RelNode tree that is : * being built from this MultiJoin. So any joins at the top level are ok for this check. : * This method just wants to ensure that the whole RelNode tree consists of : * simple trees. : */ : private static boolean isComplexTree(RelNode relNode, boolean ignoreJoin) { : if (relNode instanceof HepRelVertex) { : relNode = ((HepRelVertex)relNode).getCurrentRel(); : } : : if ((relNode instanceof Aggregate) || (relNode instanceof Union)) { : return true; : } : : if (relNode instanceof Join) { : if (!ignoreJoin) { : return true; : } : } else { : ignoreJoin = false; : } : : for (RelNode input : relNode.getInputs()) { : if (isComplexTree(input, ignoreJoin)) { : return true; : } : } : return false; : } Not sure if this method will stay but if it does you may be able to use org.apache.calcite.rel.metadata.BuiltInMetadata.NodeTypes -- To view, visit http://gerrit.cloudera.org:8080/23315 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95bba7bafcc7dd86f17921c11740ac349daac6c7 Gerrit-Change-Number: 23315 Gerrit-PatchSet: 8 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Anonymous Coward (846) Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Fri, 19 Sep 2025 09:09:46 +0000 Gerrit-HasComments: Yes
