Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/22339 )
Change subject: IMPALA-13609: Store Iceberg snapshot id for COMPUTE STATS ...................................................................... Patch Set 10: (10 comments) http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22339/3//COMMIT_MSG@41 PS3, Line 41: : Note that this change does not yet modify how Impala ch > I implemented ranges and removed the limit on the number of columns. Still, I think this can only happen in two cases: * Frequent COMPUTE STATS column-by-column on a frequently changing table * the user deliberately wants to mess up the table The first case is an anti pattern because * COMPUTE STATS shouldn't be frequent * COMPUTE STATS shouldn't be column-by-column * Iceberg tables shouldn't change too quickly We only need to deal with the second case if it can cause any harm to the Impala service. I think it doesn't matter if it causes harm to the target table, as the user already has ALTER privilege on the table (to run COMPUTE STATS), i.e. there aren't many things we can do to protect the table. Maybe we can produce a WARNING log if we see too many entries in the table property. http://gerrit.cloudera.org:8080/#/c/22339/10/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/22339/10/be/src/service/client-request-state.cc@1913 PS10, Line 1913: TExecRequest& exec_req = exec_request(); TExecRequest is created by the Frontend(.java). We could set the snapshot id there. This is something we already do for DML statements, see e.g. Frontend.addFinalizationParamsForIcebergDml(): iceFinalizeParams.setInitial_snapshot_id(iceTable.snapshotId()); http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@817 PS5, Line 817: } > Do you mean the same problem that is described in the commit message, i.e. Yes http://gerrit.cloudera.org:8080/#/c/22339/5/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@817 PS5, Line 817: } > The current solution is putting the snapshot id into the frontend part of t Added another suggestion in client-request-state.cc http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@790 PS10, Line 790: private static TreeMap<Long, Long> getComputeStatsIcebergSnapshotMap( Can we have unit tests for this method? http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@790 PS10, Line 790: getComputeStatsIcebergSnapshotMap Can we move this method to IcebergUtil? http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@831 PS10, Line 831: nit: too many spaces http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@856 PS10, Line 856: ComputeStatsIcebergSnapshotPropertyConverter Can we move this class to IcebergUtil? http://gerrit.cloudera.org:8080/#/c/22339/10/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@856 PS10, Line 856: private static class ComputeStatsIcebergSnapshotPropertyConverter { Could you please add unit tests for this class? http://gerrit.cloudera.org:8080/#/c/22339/10/testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test: http://gerrit.cloudera.org:8080/#/c/22339/10/testdata/workloads/functional-query/queries/QueryTest/iceberg-compute-stats.test@417 PS10, Line 417: .* Do we need the .* here? If it's for whitespaces, can we use \s* instead? -- To view, visit http://gerrit.cloudera.org:8080/22339 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id9998b84c4fd20d1cf5e97a34f3553832ec70ae7 Gerrit-Change-Number: 22339 Gerrit-PatchSet: 10 Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Thu, 06 Mar 2025 14:41:52 +0000 Gerrit-HasComments: Yes