Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/23113 )
Change subject: IMPALA-13267: Display number of partitions for Iceberg tables ...................................................................... Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@412 PS6, Line 412: bergPartition, Integer> conv We can avoid calling partitionMap.get() in the loop body by iterating over partitionMap.entrySet: for (Map.Entry<TIcebergPartition, Integer> entry : partitionMap.entrySet()) { partitionList.set(entry.getValue(), entry.getKey()); } http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@233 PS6, Line 233: selectedPartitions.add( Using BitSet would be much more memory efficient, and much faster as well. http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@235 PS6, Line 235: } I don't think we need this Preconditions check in this tight loop. ((IcebergFileDescriptor)fd) will throw ClassCastException anyway if fd is not an IcebergFileDescriptor. http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java: http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@857 PS6, Line 857: if (partitionId == null) { : // If the partition is not in any of the maps abov Since fileStore is shared between concurrent queries, it's possible that parallel time-travel queries will set the same partitionId for different partitions. http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test File testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test: http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test@29 PS6, Line 29: 8 Why did row-size change? I guess it is due to an independent change, maybe https://gerrit.cloudera.org/#/c/22761/ http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test: http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test@28 PS6, Line 28: Just an idea, feel free to discuss it separately, but it might make sense to output "Iceberg partitions" instead of "HDFS partitions" http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test@71 PS6, Line 71: What happens if we issue REFRESH instead of INVALIDATE METADATA? To fix it, we'll need to recreate the partitions map during table (re)loading with the help of the old partitions map. We cannot just re-use and grow the old partitions map unfortunately. -- To view, visit http://gerrit.cloudera.org:8080/23113 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb2f654bc6c9bdf9cfafc27b38b5ca2f7b6b4872 Gerrit-Change-Number: 23113 Gerrit-PatchSet: 4 Gerrit-Owner: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 12 Aug 2025 16:37:19 +0000 Gerrit-HasComments: Yes
