Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21516 )

Change subject: IMPALA-13077: Fix selectivity estimation for SEMI JOIN
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/21516/4/fe/src/main/java/org/apache/impala/planner/JoinNode.java@747
PS4, Line 747: lhsDivisor
Doesn't having a value of 1 here (and in the other formulas) mean that the
selectivity will be >1 most of the time? That doesn't make sense for a semi
join. I think it would be clearer to skip the predicate in these cases.

One way to simplify this is to initialize variables like isAntiJoin, inputNdv,
and filterNdv earlier, and "continue" if inputNdv is -1.
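
A minimal sketch of that shape, not the actual JoinNode code. The class
ConjunctStats, the method estimateSelectivity, and the formula
min(1, filterNdv / inputNdv) are hypothetical stand-ins for illustration.
Skipping on an unknown filterNdv is also an assumption on my side. Only the
names isAntiJoin, inputNdv, filterNdv and the -1 "unknown" convention come
from the comment above.

```java
import java.util.List;

public class SemiJoinSelectivitySketch {
  /** Hypothetical per-conjunct stats holder; an NDV of -1 means "unknown". */
  static final class ConjunctStats {
    final double inputNdv;   // NDV on the side whose rows are returned
    final double filterNdv;  // NDV on the side that only filters
    ConjunctStats(double inputNdv, double filterNdv) {
      this.inputNdv = inputNdv;
      this.filterNdv = filterNdv;
    }
  }

  /**
   * Returns the smallest per-conjunct selectivity, always within [0, 1].
   * Conjuncts without usable NDVs are skipped instead of dividing by 1.
   */
  static double estimateSelectivity(List<ConjunctStats> conjuncts, boolean isAntiJoin) {
    double minSelectivity = 1.0;
    for (ConjunctStats c : conjuncts) {
      // Initialize everything up front so the skip condition is a single check.
      double inputNdv = c.inputNdv;
      double filterNdv = c.filterNdv;
      // Unknown (-1) or zero input NDV: no estimate possible, skip the predicate.
      // Skipping on unknown filterNdv is an assumption of this sketch.
      if (inputNdv <= 0 || filterNdv < 0) continue;

      // Illustrative formula: fraction of input values that can find a match.
      double matchFraction = Math.min(1.0, filterNdv / inputNdv);
      double selectivity = isAntiJoin ? 1.0 - matchFraction : matchFraction;
      minSelectivity = Math.min(minSelectivity, selectivity);
    }
    return minSelectivity;
  }
}
```

With this shape a conjunct with inputNdv = -1 leaves the estimate at 1.0 rather
than producing a value above 1, and the result stays in [0, 1] for both the
semi join and anti join cases.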


http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test:

http://gerrit.cloudera.org:8080/#/c/21516/4/testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test@486
PS4, Line 486: where ss_sold_date_sk=(
             :   select min(d_date_sk) + 1000 from tpcds.date_dim)
This doesn't look like a perfect test to me, as it could also be fixed by
setting the NDV of the scalar subquery. If such a fix goes in, the test will
no longer verify the current patch.



--
To view, visit http://gerrit.cloudera.org:8080/21516
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c799df535d764c3f87ededef1c48eaa103293a0
Gerrit-Change-Number: 21516
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 19 Jun 2024 16:44:38 +0000
Gerrit-HasComments: Yes
