Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23313 )
Change subject: IMPALA-14102: [part 2] Fixed the JoinTranspose rule. ...................................................................... Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaJoinProjectTransposeRule.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaJoinProjectTransposeRule.java: http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaJoinProjectTransposeRule.java@60 PS6, Line 60: pushed > nit: this rule is 'pulling' above the join. 'Push' would imply pushing bel Done http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaJoinProjectTransposeRule.java@61 PS6, Line 61: * While the JoinProjectTranspose rule handles this, it has one flaw. If there is a > To clarify, suppose I have a plan: It's been my observation that the RelFieldTrimmer trims the fields with the Project underneath the join. This is important and relevant to the scan node so that we only select out the fields needed for the join. If there is a project above the join, it might be the case that the RelFieldTrimmer is removing the fields that the top level join doesn't need. But the fields are still needed for the lower level join and thus needed from the HdfsScanNode. While it may have some impact for the upper level join, it will not nearly have as much impact as retrieving the columns from the ScanNode which is why I felt it is more important to keep the lower level Project but not as important as leaving the Project in place between the Joins. We may want to look at this again in the future, however, to get a more accurate join optimization. http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaJoinProjectTransposeRule.java@93 PS6, Line 93: if (join.getJoinType() == JoinRelType.SEMI) { > For semi-joins, there's a different transpose rule in Calcite: core/src/ma I skipped over this for now because it caused problems here. That may be something to look at in the future, so I'll file a Jira for this to test and see if it can help. http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java: http://gerrit.cloudera.org:8080/#/c/23313/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/service/CalciteOptimizer.java@231 PS6, Line 231: // XXX: add comment about project > nit: remove the XXX annotation. Done -- To view, visit http://gerrit.cloudera.org:8080/23313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f62ec030fc8fbe36e6150bf96c9673c44b7da1b Gerrit-Change-Number: 23313 Gerrit-PatchSet: 6 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Sun, 07 Sep 2025 02:54:06 +0000 Gerrit-HasComments: Yes
