Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/22873


Change subject: IMPALA-14014: Fix COMPUTE STATS with TABLESAMPLE clause
......................................................................

IMPALA-14014: Fix COMPUTE STATS with TABLESAMPLE clause

COMPUTE STATS with TABLESAMPLE clause did a full scan on Iceberg
tables since IMPALA-13737, because before this patch ComputeStatsStmt
used FeFsTable.Utils.getFilesSample() which only works correctly on
FS tables that have the file descriptors loaded. Since IMPALA-13737
the internal FS table of an Iceberg table doesn't have file descriptor
information, therefore FeFsTable.Utils.getFilesSample() returned an
empty map which turned off table sampling for COMPUTE STATS.

We did not have proper testing for COMPUTE STATS with table sampling
therefore we did not catch the regression.

This patch adds proper table sampling logic for Iceberg tables that
can be used for COMPUTE STATS.

Testing
 * added e2e tests

Change-Id: Ie59d5fc1374ab69209a74f2488bcb9a7d510b782
---
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M testdata/datasets/functional/functional_schema_template.sql
A 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-compute-stats-table-sampling.test
M tests/query_test/test_iceberg.py
9 files changed, 370 insertions(+), 53 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/22873/1
--
To view, visit http://gerrit.cloudera.org:8080/22873
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie59d5fc1374ab69209a74f2488bcb9a7d510b782
Gerrit-Change-Number: 22873
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>

Reply via email to