Hello Zoltan Borok-Nagy, Noemi Pap-Takacs, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/22216
to look at the new patch set (#3).
Change subject: IMPALA-11265: Part3: Remove redundancy of file descriptors in
coordinators
......................................................................
IMPALA-11265: Part3: Remove redundancy of file descriptors in coordinators
In local catalog mode the coordinator caches objects such as
TPartialTableInfo and PartitionMetadataImpl. These objects contain
metadata for file descriptors, where each file descriptor has a
'general' part and an Iceberg-specific part.
Currently the 'general' part of the file descriptor metadata is cached
redundantly: both PartitionMetadataImpl and TPartialTableInfo contain
it, and the latter additionally contains the Iceberg-specific part.
To avoid this redundancy, the IcebergContentFileStore no longer converts
the 'general' part of a file descriptor to thrift and instead lets the
HdfsTable inside the Iceberg table send this information. As a result,
the TPartialTableInfo cached on the coordinator side is smaller, and the
full file descriptor is reassembled when the LocalIcebergTable object
is created.
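The reassembly idea can be sketched roughly as follows. This is only an
illustration and not the code in this patch: the class names GeneralPart,
IcebergPart and FullDescriptor, their fields, and the choice to match the
two halves by file path are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FileDescriptorSplitSketch {
    // Hypothetical stand-in for the 'general' part, assumed to arrive via the
    // HdfsTable / partition metadata path.
    static final class GeneralPart {
        final String path;
        final long lengthBytes;
        GeneralPart(String path, long lengthBytes) {
            this.path = path;
            this.lengthBytes = lengthBytes;
        }
    }

    // Hypothetical stand-in for the Iceberg-specific part, assumed to be the
    // only part that the content file store still serializes.
    static final class IcebergPart {
        final String fileFormat;
        final long recordCount;
        IcebergPart(String fileFormat, long recordCount) {
            this.fileFormat = fileFormat;
            this.recordCount = recordCount;
        }
    }

    // The full descriptor references both halves instead of copying them,
    // so the 'general' part is held in memory only once.
    static final class FullDescriptor {
        final GeneralPart general;
        final IcebergPart iceberg;
        FullDescriptor(GeneralPart general, IcebergPart iceberg) {
            this.general = general;
            this.iceberg = iceberg;
        }
    }

    // Joins the two halves. Matching by path is an assumption of this sketch.
    static List<FullDescriptor> assemble(List<GeneralPart> generalParts,
                                         Map<String, IcebergPart> icebergPartsByPath) {
        List<FullDescriptor> result = new ArrayList<>();
        for (GeneralPart general : generalParts) {
            IcebergPart iceberg = icebergPartsByPath.get(general.path);
            if (iceberg == null) {
                throw new IllegalStateException(
                    "No Iceberg-specific metadata for " + general.path);
            }
            result.add(new FullDescriptor(general, iceberg));
        }
        return result;
    }

    public static void main(String[] args) {
        List<GeneralPart> generalParts = new ArrayList<>();
        generalParts.add(new GeneralPart("/tbl/data/a.parquet", 1024L));
        generalParts.add(new GeneralPart("/tbl/data/b.parquet", 2048L));

        Map<String, IcebergPart> icebergParts = new LinkedHashMap<>();
        icebergParts.put("/tbl/data/a.parquet", new IcebergPart("PARQUET", 10L));
        icebergParts.put("/tbl/data/b.parquet", new IcebergPart("PARQUET", 20L));

        for (FullDescriptor fd : assemble(generalParts, icebergParts)) {
            System.out.println(fd.general.path + " " + fd.general.lengthBytes
                + " " + fd.iceberg.fileFormat + " " + fd.iceberg.recordCount);
        }
    }
}
```

The point of the sketch is that each half has a single owner, so nothing
is stored twice, and the combined view is built only when it is needed.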
Measurements:
- The test table contains 1000 snapshots and 110k files.
- Before this change the coordinator cached 76% more data in JVM memory
for an Iceberg table than for a Hive table with the same structure.
- This change brings the difference down to approx. 30%.
Change-Id: I37c245d907baf1e46cb39d046c544bf50b37d581
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
6 files changed, 98 insertions(+), 50 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/22216/3
--
To view, visit http://gerrit.cloudera.org:8080/22216
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I37c245d907baf1e46cb39d046c544bf50b37d581
Gerrit-Change-Number: 22216
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>