Zoltan Borok-Nagy has uploaded this change for review. (
http://gerrit.cloudera.org:8080/23705 )


Change subject: IMPALA-13756: Fix Iceberg V2 count(*) optimization for complex 
queries
......................................................................

IMPALA-13756: Fix Iceberg V2 count(*) optimization for complex queries

In SelectStmt we enable count(*) optimization for Iceberg V2 queries
if certain criteria are met. We store this information in the query
context. Then IcebergScanPlanner applies count(*) optimization based
on the information in the query context.

This mechanism works for simple queries, but can turn on count(*)
optimization for all Iceberg V2 tables in complex queries if at
least one subquery enables count(*) optimization during analysis.

With this patch SelectStmt enables count(*) optimization via the
TableRef object, which is later given to the IcebergScanPlanner.
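
To illustrate the difference between the two schemes, here is a minimal
sketch. All class and method names in it (SketchTableRef, SketchQueryCtx,
CountStarFlagSketch) are hypothetical stand-ins, not the real Impala classes;
it only shows why a query-wide flag leaks to every scan while a flag on the
table reference does not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table reference; not Impala's TableRef.
class SketchTableRef {
    final String name;
    // Per-reference flag: set only when the enclosing SELECT qualifies.
    boolean countStarOptimized = false;

    SketchTableRef(String name) { this.name = name; }
}

// Hypothetical stand-in for the query context; one instance per statement.
class SketchQueryCtx {
    // Query-wide flag: a single value shared by every scan in the statement.
    boolean countStarOptimized = false;
}

class CountStarFlagSketch {
    // Old scheme: every scan consults the shared query context, so one
    // qualifying subquery switches the optimization on for all tables.
    static List<String> optimizedViaQueryCtx(SketchQueryCtx ctx,
                                             List<SketchTableRef> refs) {
        List<String> out = new ArrayList<>();
        for (SketchTableRef ref : refs) {
            if (ctx.countStarOptimized) out.add(ref.name);
        }
        return out;
    }

    // New scheme: every scan consults only its own table reference.
    static List<String> optimizedViaTableRef(List<SketchTableRef> refs) {
        List<String> out = new ArrayList<>();
        for (SketchTableRef ref : refs) {
            if (ref.countStarOptimized) out.add(ref.name);
        }
        return out;
    }
}
```

With two table references where only the first one qualifies, the query-wide
flag marks both scans as optimized, while the per-reference flag marks only
the first.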

The count(*) expression also needs to be rewritten to show the proper
count of the table:

       AGGREGATE
       COUNT(*)
           |
       UNION ALL
      /        \
     /          \
    /            \
   SCAN all  ANTI JOIN
   datafiles  /      \
   without   /        \
   deletes  SCAN      SCAN
            datafiles deletes

            ||
          rewrite
            ||
            \/

  ArithmeticExpr: LHS + RHS
      /             \
     /               \
    /                 \
   record_count  AGGREGATE
   of all        COUNT(*)
   datafiles         |
   without       ANTI JOIN
   deletes      /         \
               /           \
               SCAN        SCAN
               datafiles   deletes

With this patch, instead of using ArithmeticExpr, we introduce
IcebergV2CountStarAdjuster so we know if the above optimization is
already applied.
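
The idea of a dedicated node type can be sketched as follows. This is only
an illustration under assumptions: SketchExpr, SketchCountStar,
SketchCountStarAdjuster and CountStarRewriteSketch are hypothetical names,
and the real IcebergV2CountStarAdjuster may be structured differently. The
sketch shows the arithmetic from the diagram above (record_count of the data
files without deletes plus the COUNT(*) over the anti join) and how the node
type itself can signal that the adjustment has already been applied.

```java
// Hypothetical expression interface; not Impala's Expr hierarchy.
interface SketchExpr {
    long eval();
}

// Stands for COUNT(*) computed over the ANTI JOIN branch only.
class SketchCountStar implements SketchExpr {
    private final long rowsAfterAntiJoin;

    SketchCountStar(long rowsAfterAntiJoin) {
        this.rowsAfterAntiJoin = rowsAfterAntiJoin;
    }

    @Override
    public long eval() { return rowsAfterAntiJoin; }
}

// Dedicated node: its type alone tells a later pass that the record count
// of the delete-free data files has already been added.
class SketchCountStarAdjuster implements SketchExpr {
    private final long recordCountWithoutDeletes;
    private final SketchExpr child;

    SketchCountStarAdjuster(long recordCountWithoutDeletes, SketchExpr child) {
        this.recordCountWithoutDeletes = recordCountWithoutDeletes;
        this.child = child;
    }

    @Override
    public long eval() { return recordCountWithoutDeletes + child.eval(); }
}

class CountStarRewriteSketch {
    // Applies the adjustment at most once; a generic "LHS + RHS" node
    // could not be told apart from an ordinary addition in the query.
    static SketchExpr adjust(SketchExpr expr, long recordCountWithoutDeletes) {
        if (expr instanceof SketchCountStarAdjuster) return expr;
        return new SketchCountStarAdjuster(recordCountWithoutDeletes, expr);
    }
}
```

Calling adjust() a second time on the result returns the same node, so the
record count is not added twice.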

Testing
 * e2e tests

Change-Id: I1940031298eb634aa82c3d32bbbf16bce8eaf874
---
M common/thrift/Query.thrift
A fe/src/main/java/org/apache/impala/analysis/IcebergV2CountStarAdjuster.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
A testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-count-star-optimization-in-complex-query.test
M tests/query_test/test_iceberg.py
8 files changed, 193 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/23705/1
--
To view, visit http://gerrit.cloudera.org:8080/23705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1940031298eb634aa82c3d32bbbf16bce8eaf874
Gerrit-Change-Number: 23705
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
