Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/23705 )
Change subject: IMPALA-13756: Fix Iceberg V2 count(*) optimization for complex
queries
......................................................................
IMPALA-13756: Fix Iceberg V2 count(*) optimization for complex queries
In SelectStmt we enable count(*) optimization for Iceberg V2 queries
if certain criteria are met. We store this information in the query
context. Then IcebergScanPlanner applies count(*) optimization based
on the information in the query context.
This mechanism works for simple queries, but can turn on count(*)
optimization for all Iceberg V2 tables in complex queries if at
least one subquery enables count(*) optimization during analysis.
With this patch SelectStmt enables count(*) optimization via the
TableRef object which is later given to the IcebergScanPlanner.
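The difference between the two mechanisms can be illustrated with a minimal
sketch. It is not Impala code: TableRefSketch, QueryCtxSketch and the two
planned* methods are hypothetical stand-ins invented for illustration, and the
real TableRef and IcebergScanPlanner APIs are not shown in this message.

```java
import java.util.ArrayList;
import java.util.List;

public class CountStarFlagSketch {
  /** Hypothetical stand-in for a table reference; not Impala's TableRef. */
  static final class TableRefSketch {
    final String name;
    // Per-reference flag, set only by the SELECT block that qualifies.
    boolean optimizeCountStar = false;
    TableRefSketch(String name) { this.name = name; }
  }

  /** Hypothetical stand-in for query-wide state shared by all SELECT blocks. */
  static final class QueryCtxSketch {
    boolean optimizeCountStar = false;
  }

  /** Old scheme: one query-wide flag decides for every scanned table. */
  static List<String> plannedWithQueryWideFlag(
      QueryCtxSketch ctx, List<TableRefSketch> refs) {
    List<String> optimized = new ArrayList<>();
    for (TableRefSketch ref : refs) {
      if (ctx.optimizeCountStar) optimized.add(ref.name);
    }
    return optimized;
  }

  /** New scheme: each table reference carries its own decision. */
  static List<String> plannedWithPerRefFlag(List<TableRefSketch> refs) {
    List<String> optimized = new ArrayList<>();
    for (TableRefSketch ref : refs) {
      if (ref.optimizeCountStar) optimized.add(ref.name);
    }
    return optimized;
  }

  public static void main(String[] args) {
    // Two references in one query; only the subquery over t1 is a plain count(*).
    TableRefSketch t1 = new TableRefSketch("t1");
    TableRefSketch t2 = new TableRefSketch("t2");
    List<TableRefSketch> refs = List.of(t1, t2);

    QueryCtxSketch ctx = new QueryCtxSketch();
    ctx.optimizeCountStar = true;  // set while analyzing the t1 subquery
    t1.optimizeCountStar = true;   // same decision, recorded on the reference

    System.out.println(plannedWithQueryWideFlag(ctx, refs));
    System.out.println(plannedWithPerRefFlag(refs));
  }
}
```

With the query-wide flag both t1 and t2 are treated as optimizable, while the
per-reference flag limits the optimization to t1.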
The count(*) expression also needs to be rewritten to show the proper
count of the table:
        AGGREGATE
        COUNT(*)
            |
        UNION ALL
       /         \
      /           \
     /             \
SCAN all        ANTI JOIN
datafiles        /      \
without         /        \
deletes       SCAN      SCAN
            datafiles  deletes
            ||
          rewrite
            ||
            \/
   ArithmeticExpr: LHS + RHS
       /             \
      /               \
     /                 \
record_count        AGGREGATE
of all              COUNT(*)
datafiles               |
without             ANTI JOIN
deletes              /      \
                    /        \
                  SCAN      SCAN
                datafiles  deletes
With this patch, instead of using ArithmeticExpr, we introduce
IcebergV2CountStarAdjuster so we know if the above optimization is
already applied.
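The rewrite and the "already applied" check can be sketched as follows. This
is only an illustration under assumptions: ExprSketch, CountStarSketch,
AdjusterSketch and rewrite() are hypothetical names, and the sketch assumes
that a dedicated expression type is what makes the applied rewrite
recognizable. The actual IcebergV2CountStarAdjuster class may work differently.

```java
public class CountStarAdjusterSketch {
  /** Hypothetical expression interface; evaluates given the ANTI JOIN row count. */
  interface ExprSketch {
    long eval(long antiJoinCount);
  }

  /** Stand-in for COUNT(*) over the ANTI JOIN branch only. */
  static final class CountStarSketch implements ExprSketch {
    @Override
    public long eval(long antiJoinCount) { return antiJoinCount; }
  }

  /**
   * Stand-in for a dedicated adjuster node: adds the record_count of all
   * data files without deletes to the child's count.
   */
  static final class AdjusterSketch implements ExprSketch {
    final long recordCountWithoutDeletes;
    final ExprSketch child;

    AdjusterSketch(long recordCountWithoutDeletes, ExprSketch child) {
      this.recordCountWithoutDeletes = recordCountWithoutDeletes;
      this.child = child;
    }

    @Override
    public long eval(long antiJoinCount) {
      return recordCountWithoutDeletes + child.eval(antiJoinCount);
    }
  }

  /** Applies the rewrite at most once: an adjuster node is left untouched. */
  static ExprSketch rewrite(ExprSketch expr, long recordCountWithoutDeletes) {
    if (expr instanceof AdjusterSketch) return expr;  // already rewritten
    return new AdjusterSketch(recordCountWithoutDeletes, expr);
  }

  public static void main(String[] args) {
    // Illustrative numbers: 1000 rows in data files without deletes,
    // 25 rows surviving the ANTI JOIN against the delete files.
    ExprSketch once = rewrite(new CountStarSketch(), 1000L);
    ExprSketch twice = rewrite(once, 1000L);
    System.out.println(once.eval(25L));
    System.out.println(once == twice);
  }
}
```

A generic addition node cannot be told apart from an addition the user wrote,
so in this sketch the distinct type is what keeps the constant from being
added a second time.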
Testing
* e2e tests
Change-Id: I1940031298eb634aa82c3d32bbbf16bce8eaf874
---
M common/thrift/Query.thrift
A fe/src/main/java/org/apache/impala/analysis/IcebergV2CountStarAdjuster.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
A testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-count-star-optimization-in-complex-query.test
M tests/query_test/test_iceberg.py
8 files changed, 193 insertions(+), 11 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/23705/1
--
To view, visit http://gerrit.cloudera.org:8080/23705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1940031298eb634aa82c3d32bbbf16bce8eaf874
Gerrit-Change-Number: 23705
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>