Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22339 )

Change subject: IMPALA-13609: Store Iceberg snapshot id for COMPUTE STATS
......................................................................


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG@41
PS3, Line 41:
            : Note that this change does not yet modify how Impala ch
> I implemented ranges and removed the limit on the number of columns. Still,
I think this can only happen in two cases:

* Frequent COMPUTE STATS column-by-column on a frequently changing table
* the user deliberately wants to mess up the table

The first case is an anti pattern because

* COMPUTE STATS shouldn't be frequent
* COMPUTE STATS shouldn't be column-by-column
* Iceberg tables shouldn't change too quickly

We only need to deal with the second case if it can cause any harm to the 
Impala service. I think it doesn't matter if it causes harm to the target 
table, as the user already has ALTER privilege on the table (to run COMPUTE 
STATS), i.e. there aren't many things we can do to protect the table. Maybe we 
can produce a WARNING log if we see too many entries in the table property.


http://gerrit.cloudera.org:8080/#/c/22339/10/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/22339/10/be/src/service/client-request-state.cc@1913
PS10, Line 1913: TExecRequest& exec_req = exec_request();
TExecRequest is created by the Frontend(.java). We could set the snapshot id 
there. This is something we already do for DML statements, see e.g. 
Frontend.addFinalizationParamsForIcebergDml():

 iceFinalizeParams.setInitial_snapshot_id(iceTable.snapshotId());


http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@817
PS5, Line 817:       }
> Do you mean the same problem that is described in the commit message, i.e.
Yes


http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@817
PS5, Line 817:       }
> The current solution is putting the snapshot id into the frontend part of t
Added another suggestion in client-request-state.cc


http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@790
PS10, Line 790:   private static TreeMap<Long, Long> 
getComputeStatsIcebergSnapshotMap(
Can we have unit tests for this method?


http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@790
PS10, Line 790: getComputeStatsIcebergSnapshotMap
Can we move this method to IcebergUtil?


http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@831
PS10, Line 831:
nit: too many spaces


http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@856
PS10, Line 856: ComputeStatsIcebergSnapshotPropertyConverter
Can we move this class to IcebergUtil?


http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@856
PS10, Line 856:   private static class 
ComputeStatsIcebergSnapshotPropertyConverter {
Could you please add unit tests for this class?


http://gerrit.cloudera.org:8080/#/c/22339/10/testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test:

http://gerrit.cloudera.org:8080/#/c/22339/10/testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test@417
PS10, Line 417: .*
Do we need the .* here? If it's for whitespaces, can we use \s* instead?



--
To view, visit http://gerrit.cloudera.org:8080/22339
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id9998b84c4fd20d1cf5e97a34f3553832ec70ae7
Gerrit-Change-Number: 22339
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Mar 2025 14:41:52 +0000
Gerrit-HasComments: Yes

Reply via email to