Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/21516 )
Change subject: IMPALA-13077: Fix selectivity estimation for SEMI JOIN
......................................................................

Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@747
PS4, Line 747: lhsDivisor
Doesn't having 1 here (and in other formulas) mean that the selectivity will be >1 most of the time? That doesn't make sense for a semi join. I think it would be clearer to skip the predicate in these cases.

One way to simplify this is to initialize variables like isAntiJoin, inputNdv, and filterNdv earlier, and "continue" if inputNdv is -1.

http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test:

http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test@486
PS4, Line 486: where ss_sold_date_sk=(
             :   select min(d_date_sk) + 1000 from tpcds.date_dim)
This doesn't look like a perfect test to me, as it could also be fixed by setting the NDV of the scalar subquery. If such a fix goes in, this test will no longer verify the current patch.
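To illustrate the early-"continue" suggestion in the JoinNode.java comment, here is a minimal standalone sketch. Everything in it is hypothetical: the class, the ConjunctStats holder, and the min(1, filterNdv / inputNdv) estimate are placeholders chosen for illustration, not the formula used in the patch, and isAntiJoin handling is left out. The only point is that a conjunct with unknown NDV (-1) is skipped instead of being given a divisor of 1.

```java
import java.util.List;

final class SemiJoinSelectivitySketch {
  /** Hypothetical per-conjunct stats holder; -1 means the NDV is unknown. */
  static final class ConjunctStats {
    final double inputNdv;   // NDV of the probe-side (lhs) expression
    final double filterNdv;  // NDV of the build-side (rhs) expression

    ConjunctStats(double inputNdv, double filterNdv) {
      this.inputNdv = inputNdv;
      this.filterNdv = filterNdv;
    }
  }

  /**
   * Returns the smallest per-conjunct selectivity estimate, or -1 if no
   * conjunct has usable stats. The estimate itself is an assumed placeholder.
   */
  static double minSelectivity(List<ConjunctStats> conjuncts) {
    double result = -1;
    for (ConjunctStats c : conjuncts) {
      // Read the stats first and bail out early on unknown values,
      // rather than substituting 1 as a divisor later.
      double inputNdv = c.inputNdv;
      double filterNdv = c.filterNdv;
      if (inputNdv <= 0 || filterNdv < 0) continue;

      // Clamp so the estimate can never exceed 1.
      double sel = Math.min(1.0, filterNdv / inputNdv);
      if (result < 0 || sel < result) result = sel;
    }
    return result;
  }
}
```

With this shape, a caller can treat a return value of -1 as "no estimate available" and fall back to whatever default the planner already uses.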
--
To view, visit http://gerrit.cloudera.org:8080/21516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c799df535d764c3f87ededef1c48eaa103293a0
Gerrit-Change-Number: 21516
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:44:38 +0000
Gerrit-HasComments: Yes
