Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21955 )

Change subject: IMPALA-13465: Trace TupleId further to reduce Agg cardinality
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21955/8/testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
File testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test:

http://gerrit.cloudera.org:8080/#/c/21955/8/testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test@1328
PS8, Line 1328: |  row-size=16B cardinality=600
On second thought, I don't like this extreme reduction.

numRows(tpch_parquet.customer) = 150000
ndv(c_nationkey) = 25

cardinality(00:SCAN) = 6000 comes from the selectivity of the predicate
c_nationkey = 16 (150000 / 25). We can't accurately measure the selectivity of
this predicate during planning. The output cardinality of 600 is a further
reduction from having: count(*) < 150000 (the Planner defaults to 0.1
selectivity when it can't estimate better).
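
To make the arithmetic above concrete, here is a minimal sketch. The function names are hypothetical and this is not Impala's planner code; it only replays the two reductions described in this comment (row count divided by NDV, then the 0.1 default selectivity).

```python
# Hypothetical helpers for illustration only; not Impala planner code.
DEFAULT_SELECTIVITY = 0.1  # fallback selectivity quoted in the comment


def equality_scan_estimate(num_rows: int, ndv: int) -> int:
    """Estimate rows surviving `col = const` as num_rows / ndv."""
    return round(num_rows / ndv)


def apply_default_selectivity(input_rows: int,
                              selectivity: float = DEFAULT_SELECTIVITY) -> int:
    """Apply a fallback selectivity when no better estimate is available."""
    return round(input_rows * selectivity)


# 150000 customer rows, 25 distinct values in the predicate column.
scan_rows = equality_scan_estimate(150_000, 25)
# The having predicate then cuts the scan-bounded estimate by the default 0.1.
agg_rows = apply_default_selectivity(scan_rows)
print(scan_rows, agg_rows)
```

The two reductions compound, so any error in the scan estimate carries straight into the aggregate estimate.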

If we do tuple-based reduction for this lone c_custkey expression, then we
should do it for all expressions, whether or not their scan cardinality is
also reduced by a predicate.

This is the ExecSummary from a real run:

    ExecSummary:
    Operator                #Hosts  #Inst    Avg Time    Max Time    #Rows  Est. #Rows    Peak Mem  Est. Peak Mem  Detail
    ------------------------------------------------------------------------------------------------------------------------------------
    F03:ROOT                     1      1   209.867us   209.867us                          4.01 MB        4.00 MB
    06:EXCHANGE                  1      1   273.126us   273.126us    4.08K         600    48.00 KB       21.09 KB  UNPARTITIONED
    F02:EXCHANGE SENDER          1      1   622.052us   622.052us                         25.21 KB       80.00 KB
    03:AGGREGATE                 1      1     0.000ns     0.000ns    4.08K         600     5.03 MB       10.00 MB  FINALIZE
    02:HASH JOIN                 1      1    39.887ms    39.887ms   61.27K      91.47K   104.05 MB       80.83 MB  INNER JOIN, PARTITIONED
    |--05:EXCHANGE               1      1     5.047ms     5.047ms    1.50M       1.50M     2.97 MB        5.75 MB  HASH(o_custkey)
    |  F01:EXCHANGE SENDER       2      2    25.312ms    28.458ms                         68.27 KB       48.00 KB
    |  01:SCAN HDFS              2      2     5.138ms     7.039ms    1.50M       1.50M    10.38 MB       40.00 MB  tpch_parquet.orders
    04:EXCHANGE                  1      1    31.329us    31.329us    4.08K       6.00K    96.00 KB       72.59 KB  HASH(c_custkey)
    F00:EXCHANGE SENDER          1      1   135.283us   135.283us                         33.63 KB       56.00 KB
    00:SCAN HDFS                 1      1   110.972ms   110.972ms    4.08K       6.00K     2.63 MB       48.00 MB  tpch_parquet.customer
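
A rough way to quantify the gap between the estimated and actual row counts in the ExecSummary is sketched here. The helper names are hypothetical, and the inputs are the rounded values printed in the summary (for example 4.08K), so the ratios are approximate.

```python
# Hypothetical helpers for illustration; inputs are rounded ExecSummary values.
def parse_rows(text: str) -> int:
    """Convert a rounded count such as '4.08K' or '600' to an integer."""
    suffixes = {"K": 1_000, "M": 1_000_000}
    if text[-1] in suffixes:
        return round(float(text[:-1]) * suffixes[text[-1]])
    return int(text)


def underestimate_factor(actual: str, estimated: str) -> float:
    """Actual rows divided by estimated rows; above 1 means underestimation."""
    return parse_rows(actual) / parse_rows(estimated)


# (operator, #Rows, Est. #Rows) copied from the ExecSummary.
rows = [
    ("00:SCAN HDFS", "4.08K", "6.00K"),
    ("03:AGGREGATE", "4.08K", "600"),
]
for op, actual, est in rows:
    print(f"{op}: actual/estimated = {underestimate_factor(actual, est):.2f}")
```

With these rounded inputs, the scan estimate is somewhat high (ratio about 0.68), while the aggregate estimate is low by a factor of roughly 6.8.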



--
To view, visit http://gerrit.cloudera.org:8080/21955
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11f59ccc469c24c1800abaad3774c56190306944
Gerrit-Change-Number: 21955
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 24 Oct 2024 20:56:00 +0000
Gerrit-HasComments: Yes
