[ 
https://issues.apache.org/jira/browse/IMPALA-15019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-15019:
------------------------------------
    Attachment: tpcds-q4-calcite-plan.txt

> Calcite planner has higher memory estimation
> --------------------------------------------
>
>                 Key: IMPALA-15019
>                 URL: https://issues.apache.org/jira/browse/IMPALA-15019
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Assignee: Steve Carlin
>            Priority: Major
>         Attachments: row-size-comparison.txt, tpcds-q4-calcite-plan.txt, tpcds-q4-original-plan.txt
>
>
> In the EXPLAIN outputs of the original planner and the calcite-planner, the 
> calcite-planner seems to always use a larger row-size, which might result in 
> higher memory estimates.
> For instance, for the following query:
> {code:sql}
> EXPLAIN SELECT count(*) FROM functional.alltypes
>  WHERE year=2009 AND int_col=1 AND string_col='1';{code}
> The original planner uses row-size=17B in the scan node, while the 
> calcite-planner uses row-size=21B.
> Original planner:
> {noformat}
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=80MB                    |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 03:AGGREGATE [FINALIZE]                                     |
> | |  output: count:merge(*)                                   |
> | |  row-size=8B cardinality=1                                |
> | |                                                           |
> | 02:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 01:AGGREGATE                                                |
> | |  output: count(*)                                         |
> | |  row-size=8B cardinality=3                                |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    partition predicates: `year` = 2009                      |
> |    HDFS partitions=12/24 files=12 size=238.68KB             |
> |    predicates: int_col = 1, string_col = '1'                |
> |    row-size=17B cardinality=115                             |
> +-------------------------------------------------------------+{noformat}
> Calcite-planner:
> {noformat}
> +--------------------------------------------------------------------------------------+
> | Explain String                                                                       |
> +--------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3                          |
> | Per-Host Resource Estimates: Memory=80MB                                             |
> | Codegen disabled by planner                                                          |
> |                                                                                      |
> | PLAN-ROOT SINK                                                                       |
> | |                                                                                    |
> | 03:AGGREGATE [FINALIZE]                                                              |
> | |  output: count:merge()                                                             |
> | |  row-size=8B cardinality=1                                                         |
> | |                                                                                    |
> | 02:EXCHANGE [UNPARTITIONED]                                                          |
> | |                                                                                    |
> | 01:AGGREGATE                                                                         |
> | |  output: count()                                                                   |
> | |  row-size=8B cardinality=3                                                         |
> | |                                                                                    |
> | 00:SCAN HDFS [functional.alltypes]                                                   |
> |    partition predicates: functional.alltypes.year = 2009                             |
> |    HDFS partitions=12/24 files=12 size=238.68KB                                      |
> |    predicates: functional.alltypes.int_col = 1, functional.alltypes.string_col = '1' |
> |    row-size=21B cardinality=115                                                      |
> +--------------------------------------------------------------------------------------+{noformat}
> I also compared TPCDS-Q4 as a more complex example. The original planner has 
> a lower memory requirement:
> {noformat}
> Max Per-Host Resource Reservation: Memory=511.00MB Threads=50
> Per-Host Resource Estimates: Memory=2.57GB{noformat}
> The calcite-planner has a higher memory requirement:
> {noformat}
> Max Per-Host Resource Reservation: Memory=539.88MB Threads=50
> Per-Host Resource Estimates: Memory=2.68GB{noformat}
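
The per-node row-size comparison described above could be automated roughly as follows. This is only a sketch, not part of Impala: the helper names (row_sizes, diff_row_sizes) are made up for illustration, it assumes row sizes are printed as whole bytes ("row-size=17B"), and it assumes node ids line up between the two plans, which holds for the simple query here but may not hold for a plan as large as TPCDS-Q4.

```python
import re

# Matches plan-node headers such as "00:SCAN HDFS [functional.alltypes]".
NODE_RE = re.compile(r"(\d+):([A-Z][A-Z ]*[A-Z])")
# Matches the per-node annotation "row-size=17B" (whole bytes only, an assumption).
ROW_SIZE_RE = re.compile(r"row-size=(\d+)B")


def row_sizes(explain_text):
    """Map each plan node id to (node name, row size in bytes)."""
    sizes = {}
    current = None
    for line in explain_text.splitlines():
        node = NODE_RE.search(line)
        if node:
            current = (node.group(1), node.group(2))
            continue
        size = ROW_SIZE_RE.search(line)
        if size and current:
            sizes[current[0]] = (current[1], int(size.group(1)))
    return sizes


def diff_row_sizes(original_plan, calcite_plan):
    """Return {node id: (original bytes, calcite bytes)} for nodes that differ."""
    a, b = row_sizes(original_plan), row_sizes(calcite_plan)
    return {
        nid: (a[nid][1], b[nid][1])
        for nid in a
        if nid in b and a[nid][1] != b[nid][1]
    }


# Abbreviated versions of the two plans quoted in this issue.
ORIGINAL_PLAN = """\
01:AGGREGATE
|  output: count(*)
|  row-size=8B cardinality=3
|
00:SCAN HDFS [functional.alltypes]
   row-size=17B cardinality=115
"""
CALCITE_PLAN = """\
01:AGGREGATE
|  output: count()
|  row-size=8B cardinality=3
|
00:SCAN HDFS [functional.alltypes]
   row-size=21B cardinality=115
"""

print(diff_row_sizes(ORIGINAL_PLAN, CALCITE_PLAN))  # {'00': (17, 21)}
```

Only the scan node is reported, which matches the observation that the aggregation nodes keep row-size=8B in both plans.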

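To put the TPCDS-Q4 numbers above in proportion, the relative increase can be computed as in the sketch below. The helper names (parse_mem, relative_increase) are hypothetical, and the 1024-based unit table is an assumption about how the values are printed; the ratios are unaffected by that choice as long as both values use the same unit.

```python
import re

# Assumed binary multipliers for the units seen in the EXPLAIN header.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_mem(text):
    """Convert a string such as '539.88MB' into a number of bytes."""
    match = re.fullmatch(r"([\d.]+)(B|KB|MB|GB)", text)
    if match is None:
        raise ValueError(f"unrecognized memory value: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


def relative_increase(original, calcite):
    """Fractional growth from the original planner's value to the calcite one."""
    base = parse_mem(original)
    return (parse_mem(calcite) - base) / base


# Values copied from the two TPCDS-Q4 plan headers quoted in this issue.
print(f"reservation: {relative_increase('511.00MB', '539.88MB'):.1%}")  # 5.7%
print(f"estimate:    {relative_increase('2.57GB', '2.68GB'):.1%}")      # 4.3%
```
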

