Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21869 )
Change subject: IMPALA-11265: Part2: Store Iceberg file descriptors in encoded format
......................................................................

IMPALA-11265: Part2: Store Iceberg file descriptors in encoded format

The file descriptors in HdfsPartition are cached as byte arrays to keep
the memory footprint low. They are transformed into actual
FileDescriptor objects when queried. This patch changes
IcebergContentFileStore to use byte arrays as its internal
representation for file descriptors in the same way.

Note that file descriptors for Iceberg tables have 2 components: one is
the same as in HdfsPartition, and the other stores Iceberg-specific file
metadata in an additional byte array.

Measurements and observations:
- I have a test table that has 110k data files. For this table the JVM
  memory usage in the catalogd dropped from 80MB to 65MB.
- Both HdfsPartition.FileDescriptor and IcebergContentFileStore use
  flatbuffers, and in turn byte arrays, to represent file descriptors,
  and these byte arrays are shared between these 2 places. As a result,
  the file descriptors are not stored redundantly for the Iceberg and
  the Hdfs table.
- There is no measurable difference in planning times with this patch.
Change-Id: I9d7794df999bdaf118158eace26cea610f911c0a
---
M fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java
6 files changed, 192 insertions(+), 100 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/21869/1
--
To view, visit http://gerrit.cloudera.org:8080/21869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d7794df999bdaf118158eace26cea610f911c0a
Gerrit-Change-Number: 21869
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab <[email protected]>
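
For readers unfamiliar with the pattern the commit message describes (keep descriptors as byte arrays, build objects only when queried), here is a minimal sketch. It is not the code in this patch: the class EncodedFileStoreSketch, its FileDesc type, and the field layout are hypothetical, and it uses java.nio.ByteBuffer where Impala uses flatbuffers.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Impala's IcebergContentFileStore.
final class EncodedFileStoreSketch {

  // Decoded view of a descriptor. Created on demand, never retained by the store.
  static final class FileDesc {
    final String path;
    final long length;
    final long modificationTime;

    FileDesc(String path, long length, long modificationTime) {
      this.path = path;
      this.length = length;
      this.modificationTime = modificationTime;
    }
  }

  // The only long-lived state: one compact byte array per file.
  private final List<byte[]> encoded = new ArrayList<>();

  // Assumed layout: [int pathLen][path bytes, UTF-8][long length][long mtime].
  static byte[] encode(String path, long length, long modificationTime) {
    byte[] pathBytes = path.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buf =
        ByteBuffer.allocate(Integer.BYTES + pathBytes.length + 2 * Long.BYTES);
    buf.putInt(pathBytes.length).put(pathBytes).putLong(length).putLong(modificationTime);
    return buf.array();
  }

  // Rebuilds a FileDesc from its encoded form.
  static FileDesc decode(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes);
    byte[] pathBytes = new byte[buf.getInt()];
    buf.get(pathBytes);
    long length = buf.getLong();
    long modificationTime = buf.getLong();
    return new FileDesc(new String(pathBytes, StandardCharsets.UTF_8), length,
        modificationTime);
  }

  // Stores the descriptor in encoded form and returns its index.
  int add(String path, long length, long modificationTime) {
    encoded.add(encode(path, length, modificationTime));
    return encoded.size() - 1;
  }

  // Decodes lazily, so each lookup pays a small CPU cost instead of heap space.
  FileDesc get(int index) {
    return decode(encoded.get(index));
  }

  int size() {
    return encoded.size();
  }
}
```

In this sketch every get() call allocates a fresh FileDesc, which trades a little decoding work per lookup for a smaller resident heap. Whether that cost matters depends on the workload; the commit message reports no measurable planning-time difference for the actual patch.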
