Gabor Kaszab has uploaded this change for review:
http://gerrit.cloudera.org:8080/22216
Change subject: IMPALA-11265: Part4: Remove redundancy of file descriptors in
coordinators
......................................................................
IMPALA-11265: Part4: Remove redundancy of file descriptors in coordinators
A coordinator in local catalog mode caches objects such as
TPartialTableInfo and PartitionMetadataImpl. These objects contain file
descriptor metadata, where each file descriptor has a 'general' part and
an Iceberg-specific part.
Currently the 'general' part of the file descriptor metadata is cached
redundantly: both PartitionMetadataImpl and TPartialTableInfo contain
it, while the latter also contains the Iceberg-specific part.
To avoid this redundancy, IcebergContentFileStore no longer converts the
'general' part of a file descriptor to Thrift and instead lets the
HdfsTable inside the Iceberg table send this information. As a result,
the TPartialTableInfo cached on the coordinator side is smaller, and the
original file descriptor is reassembled when creating the
LocalIcebergTable object.
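The reassembly step described above can be pictured with a minimal
sketch. It is not the code in this change: every class, field, and
method name below is hypothetical, and joining the two halves by file
path is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Impala.
public class FileDescriptorSplitSketch {
  /** 'General' part: already cached once through the partition metadata. */
  static final class GeneralPart {
    final String path;
    final long lengthBytes;
    GeneralPart(String path, long lengthBytes) {
      this.path = path;
      this.lengthBytes = lengthBytes;
    }
  }

  /** Iceberg-specific part: the only piece the content file store sends. */
  static final class IcebergPart {
    final String fileFormat;
    final long recordCount;
    IcebergPart(String fileFormat, long recordCount) {
      this.fileFormat = fileFormat;
      this.recordCount = recordCount;
    }
  }

  /** The full descriptor, rebuilt on the coordinator side. */
  static final class FullDescriptor {
    final GeneralPart general;
    final IcebergPart iceberg;
    FullDescriptor(GeneralPart general, IcebergPart iceberg) {
      this.general = general;
      this.iceberg = iceberg;
    }
  }

  /**
   * Joins the two halves by file path (an assumed key). Throws if an
   * Iceberg part has no matching general part.
   */
  static List<FullDescriptor> assemble(
      List<GeneralPart> fromPartitions, Map<String, IcebergPart> fromContentStore) {
    Map<String, GeneralPart> byPath = new HashMap<>();
    for (GeneralPart g : fromPartitions) {
      byPath.put(g.path, g);
    }
    List<FullDescriptor> result = new ArrayList<>();
    for (Map.Entry<String, IcebergPart> e : fromContentStore.entrySet()) {
      GeneralPart g = byPath.get(e.getKey());
      if (g == null) {
        throw new IllegalStateException("No general part for " + e.getKey());
      }
      result.add(new FullDescriptor(g, e.getValue()));
    }
    return result;
  }
}
```

In this sketch the 'general' half is stored once and only referenced
from the assembled descriptor, which mirrors the idea of caching it in a
single place.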
Measurements:
- My test table contains 1000 snapshots and 110k files.
- Before this change, the coordinator cached 76% more data in JVM memory
for an Iceberg table than for a Hive table with the same structure.
- This change brings the difference down to approx. 30%.
Change-Id: I37c245d907baf1e46cb39d046c544bf50b37d581
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
7 files changed, 122 insertions(+), 62 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/22216/1
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I37c245d907baf1e46cb39d046c544bf50b37d581
Gerrit-Change-Number: 22216
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab <[email protected]>