Hello Andrew Sherman, Daniel Becker, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21869

to look at the new patch set (#3).

Change subject: IMPALA-11265: Part2: Store Iceberg file descriptors in encoded 
format
......................................................................

IMPALA-11265: Part2: Store Iceberg file descriptors in encoded format

HdfsPartition caches its file descriptors as byte arrays to keep the
memory footprint low, and converts them into actual FileDescriptor
objects only when they are queried.
This patch changes IcebergContentFileStore to use byte arrays as its
internal representation for file descriptors as well. Note that a file
descriptor for an Iceberg table has 2 components: one is the same as in
HdfsPartition, and the other stores Iceberg-specific file metadata in an
additional byte array.
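
To illustrate the idea, here is a minimal sketch of a store that keeps
two byte arrays per file and decodes them only on lookup. It is not the
code in this patch: the class names SketchFileDescriptor and
EncodedDescriptorStore are hypothetical, the record count merely stands
in for the Iceberg-specific metadata, and a hand-rolled ByteBuffer
layout is used in place of the flatbuffers encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoded descriptor; not Impala's FileDescriptor class.
final class SketchFileDescriptor {
  final String path;
  final long length;
  final long recordCount;  // stand-in for Iceberg-specific metadata

  SketchFileDescriptor(String path, long length, long recordCount) {
    this.path = path;
    this.length = length;
    this.recordCount = recordCount;
  }
}

// Hypothetical store: holds only encoded bytes, decodes on demand.
final class EncodedDescriptorStore {
  // Component 1: the part that is the same as for a plain HDFS file.
  private final List<byte[]> fileBytes = new ArrayList<>();
  // Component 2: the additional Iceberg-specific metadata.
  private final List<byte[]> icebergBytes = new ArrayList<>();

  // Encodes the descriptor into two byte arrays and returns its index.
  int add(String path, long length, long recordCount) {
    byte[] pathBytes = path.getBytes(StandardCharsets.UTF_8);
    ByteBuffer fd = ByteBuffer.allocate(
        Long.BYTES + Integer.BYTES + pathBytes.length);
    fd.putLong(length).putInt(pathBytes.length).put(pathBytes);
    fileBytes.add(fd.array());
    icebergBytes.add(
        ByteBuffer.allocate(Long.BYTES).putLong(recordCount).array());
    return fileBytes.size() - 1;
  }

  // Builds a descriptor object from the stored bytes when queried.
  SketchFileDescriptor get(int index) {
    ByteBuffer fd = ByteBuffer.wrap(fileBytes.get(index));
    long length = fd.getLong();
    byte[] pathBytes = new byte[fd.getInt()];
    fd.get(pathBytes);
    long recordCount = ByteBuffer.wrap(icebergBytes.get(index)).getLong();
    return new SketchFileDescriptor(
        new String(pathBytes, StandardCharsets.UTF_8), length, recordCount);
  }
}
```

Only the byte arrays stay resident in such a store; each descriptor
object is short-lived and created per lookup, which is the trade-off the
description above relies on.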

Measurements and observations:
 - On a test table with 110k data files, the JVM memory usage in the
   catalogd dropped from 80MB to 65MB.
 - Both HdfsPartition.FileDescriptor and IcebergContentFileStore use
   flatbuffers, and in turn byte arrays, to represent file descriptors,
   and these byte arrays are shared between these 2 places. As a result,
   storing the file descriptors for both the Iceberg and the Hdfs table
   introduces no redundancy.
 - There is no measurable difference in planning times with this patch.

Change-Id: I9d7794df999bdaf118158eace26cea610f911c0a
---
M fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
A fe/src/test/java/org/apache/impala/catalog/IcebergContentFileStoreTest.java
8 files changed, 311 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/21869/3
--
To view, visit http://gerrit.cloudera.org:8080/21869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d7794df999bdaf118158eace26cea610f911c0a
Gerrit-Change-Number: 21869
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
