Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23113 )

Change subject: IMPALA-13267: Display number of partitions for Iceberg tables
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java:

http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@412
PS6, Line 412: bergPartition, Integer> conv
We can avoid calling partitionMap.get() in the loop body by iterating over 
partitionMap.entrySet:

    for (Map.Entry<TIcebergPartition, Integer> entry : partitionMap.entrySet()) 
{
      partitionList.set(entry.getValue(), entry.getKey());
    }


http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@233
PS6, Line 233:       selectedPartitions.add(
Using BitSet would be much more memory efficient, and much faster as well.


http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@235
PS6, Line 235:     }
I don't think we need this Preconditions check in this tight loop. 
((IcebergFileDescriptor)fd) will throw ClassCastException anyway if fd is not 
an IcebergFileDescriptor.


http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/23113/6/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@857
PS6, Line 857:       if (partitionId == null) {
             :         // If the partition is not in any of the maps abov
Since fileStore is shared between concurrent queries, it's possible that 
parallel time-travel queries will set the same partitionId for different 
partitions.


http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test:

http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-merge-insert-only.test@29
PS6, Line 29: 8
Why did row-size change? I guess it is due to an independent change, maybe 
https://gerrit.cloudera.org/#/c/22761/


http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test:

http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test@28
PS6, Line 28:
Just an idea, feel free to discuss it separately, but it might make sense to 
output "Iceberg partitions" instead of "HDFS partitions"


http://gerrit.cloudera.org:8080/#/c/23113/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitions.test@71
PS6, Line 71:
What happens if we issue REFRESH instead of INVALIDATE METADATA?

To fix it, we'll need to recreate the partitions map during table (re)loading 
with the help of the old partitions map. We cannot just re-use and grow the old 
partitions map unfortunately.



--
To view, visit http://gerrit.cloudera.org:8080/23113
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb2f654bc6c9bdf9cfafc27b38b5ca2f7b6b4872
Gerrit-Change-Number: 23113
Gerrit-PatchSet: 4
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 12 Aug 2025 16:37:19 +0000
Gerrit-HasComments: Yes

Reply via email to