Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/20866 )
Change subject: IMPALA-12412: Support partition evolution in OPTIMIZE statement
......................................................................


Patch Set 5:

(9 comments)

Left some additional comments, but the code looks great!

http://gerrit.cloudera.org:8080/#/c/20866/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20866/5//COMMIT_MSG@13
PS5, Line 13: .
and schema


http://gerrit.cloudera.org:8080/#/c/20866/5//COMMIT_MSG@24
PS5, Line 24: instances
query fragment instances


http://gerrit.cloudera.org:8080/#/c/20866/5//COMMIT_MSG@25
PS5, Line 25: at least that many files
It's not necessarily true for partitioned tables, so maybe we can just say that we might write too many files for unpartitioned tables.


http://gerrit.cloudera.org:8080/#/c/20866/5/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
File fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java:

http://gerrit.cloudera.org:8080/#/c/20866/5/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@41
PS5, Line 41: .
and schema


http://gerrit.cloudera.org:8080/#/c/20866/5/fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java@58
PS5, Line 58: column
nit: columns / or maybe "for each column"


http://gerrit.cloudera.org:8080/#/c/20866/4/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20866/4/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@525
PS4, Line 525: DataFile dataFile = createDataFile(feIcebergTable, buf);
> RewriteFiles does not have validateNoConflictingDeletes().
> How can we check
This is checked by Iceberg: https://github.com/apache/iceberg/blob/e61b0e51e151fb64cb5e39b49ea47c0809c3f84a/core/src/main/java/org/apache/iceberg/BaseRewriteFiles.java#L140


http://gerrit.cloudera.org:8080/#/c/20866/5/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/20866/5/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java@520
PS5, Line 520: throw new ImpalaRuntimeException(e.getMessage(), e);
             :     }
nit: indentation is off


http://gerrit.cloudera.org:8080/#/c/20866/5/testdata/workloads/functional-query/queries/QueryTest/iceberg-optimize.test
File testdata/workloads/functional-query/queries/QueryTest/iceberg-optimize.test:

http://gerrit.cloudera.org:8080/#/c/20866/5/testdata/workloads/functional-query/queries/QueryTest/iceberg-optimize.test@3
PS5, Line 3: CREATE TABLE ice_optimize (int_col int, string_col string, bool_col boolean)
Please add a test that invoking OPTIMIZE on an empty table returns a reasonable error message.

Currently it returns "ERROR: IllegalArgumentException: Files to delete cannot be empty". I think this message is kinda OK, but we should make sure that later changes don't raise something worse, e.g. a NullPointerException.


http://gerrit.cloudera.org:8080/#/c/20866/5/testdata/workloads/functional-query/queries/QueryTest/iceberg-optimize.test@62
PS5, Line 62: OPTIMIZE TABLE ice_optimize;
How about setting MAX_FS_WRITERS to 1 for this?

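
For illustration only, the two test suggestions above might look roughly like the following in iceberg-optimize.test. This is a sketch under assumptions, not the patch's actual test: the table name ice_optimize_empty and its definition are hypothetical, and the CATCH string is simply the message quoted in the comment above, so it would need updating if the error text changes.

```
====
---- QUERY
# Hypothetical empty Iceberg table, used only for this sketch.
CREATE TABLE ice_optimize_empty (int_col int) STORED BY ICEBERG;
====
---- QUERY
# OPTIMIZE on an empty table should fail with a readable message,
# not something like a NullPointerException.
OPTIMIZE TABLE ice_optimize_empty;
---- CATCH
Files to delete cannot be empty
====
---- QUERY
# Limit the number of writer fragment instances, as suggested for line 62.
SET MAX_FS_WRITERS=1;
OPTIMIZE TABLE ice_optimize;
====
```

Matching only on the message fragment (rather than the full "ERROR: IllegalArgumentException: ..." text) keeps the test tolerant of changes to the exception wrapper while still catching a regression to a less helpful error.
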
-- 
To view, visit http://gerrit.cloudera.org:8080/20866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I65a0c8529d274afff38ccd582f1b8a857716b1b5
Gerrit-Change-Number: 20866
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 07 Feb 2024 14:42:13 +0000
Gerrit-HasComments: Yes
