Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22157 )
Change subject: IMPALA-11265: Part3: Group the Iceberg file descriptors by partition
......................................................................

IMPALA-11265: Part3: Group the Iceberg file descriptors by partition

Originally, IcebergContentFileStore organizes the file descriptors into a map where the keys are file path hashes and the values are the file descriptors. This makes the lookup of a particular file descriptor very fast, but the memory usage of such a structure is high. One example: a test table has 16400 partitions and 110k data files. The catalogd JVM usage of this table is 61.8MB, of which the file path hash strings take 11.44MB, i.e. 18.5% of the memory usage of the whole table.

This patch reduces the catalogd JVM memory usage of Iceberg tables with two changes:
1: Restructure IcebergContentFileStore so that it maps each partition to a list of file descriptors. Note that HdfsTable also holds the per-partition file descriptors in a list.
2: Instead of using the output of HashCode.toString() as the key of the mapping, use the HashCode object itself.

With this, file descriptor lookups become somewhat slower, while the JVM memory usage of an Iceberg table is reduced. A sketch of the new layout follows the measurements below.

Measurements (test table with 16400 partitions and 110k data files):

- Memory usage
The JVM memory size of this table is reduced from 61.8MB to 48MB. Compared to a Hive table with the same characteristics, the memory size difference is reduced from 1.55x to 1.2x.

- Table loading times
The time required to do a full metadata load of an Iceberg table is unchanged with this patch.

- Query planning time #1
Wrote a query that filters on a partition column where all 110k files survive the filter and have to be looked up in IcebergContentFileStore. I used a 'WHERE id > 0' predicate where the table is partitioned by 'id' and all values are greater than zero, so this query can be considered a worst-case scenario for this table. With such a query the planning times are longer by ~40%, but still negligible compared to the full query runtime, as the average planning time increased from 0.8s to 1.14s. I think this is a regression we can live with.

- Query planning time #2
In another query I used a predicate on a non-partition column. In this case the Iceberg library doesn't pre-filter the file descriptors and Impala simply gets all the file descriptors from the ContentFileStore. The average planning time actually decreased with this patch, from 0.44s to 0.23s (~47%). I think the reason is that the file descriptors are already arranged in lists and there is only a slight overhead in creating an aggregated list that returns all of them.

- Query planning time #3
Test #1 was a worst-case scenario where Impala looked up all 110k file descriptors in the cache. In test #3 I used a predicate that filtered out 3/4 of the file descriptors. There is still a 35% degradation in planning times, but it is still fast enough that I'd consider it negligible: the average planning time changed from 0.19s to 0.26s.
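For illustration only, here is a minimal sketch of the before/after data layout described above. This is not the actual patch: class and field names are made up, the types are simplified stand-ins, and the real IcebergContentFileStore is more involved.

// Illustrative sketch only; hypothetical names, simplified types.
import com.google.common.hash.HashCode;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FileDescriptorSketch {}  // stand-in for the real file descriptor type

class ContentFileStoreSketch {
  // Before: one map keyed by the file path hash rendered as a String
  // (HashCode.toString()), so every key is a separate String object.
  Map<String, FileDescriptorSketch> byPathHashString = new HashMap<>();

  // After: (1) descriptors grouped into a list per partition, and
  // (2) hash-keyed lookups use the HashCode object itself as the key.
  Map<String, List<FileDescriptorSketch>> byPartition = new HashMap<>();
  Map<HashCode, FileDescriptorSketch> byPathHash = new HashMap<>();

  void add(String partition, HashCode pathHash, FileDescriptorSketch fd) {
    byPartition.computeIfAbsent(partition, k -> new ArrayList<>()).add(fd);
    byPathHash.put(pathHash, fd);
  }

  // Returning all descriptors is cheap because they are already stored in
  // lists; only aggregating them into one list adds a small overhead.
  List<FileDescriptorSketch> getAll() {
    List<FileDescriptorSketch> all = new ArrayList<>();
    for (List<FileDescriptorSketch> fds : byPartition.values()) {
      all.addAll(fds);
    }
    return all;
  }
}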
Change-Id: I276d839335c0aa39fa31a06ce08588a91a313768
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java
M fe/src/test/java/org/apache/impala/util/IcebergUtilTest.java
6 files changed, 190 insertions(+), 113 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/22157/1
--
To view, visit http://gerrit.cloudera.org:8080/22157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I276d839335c0aa39fa31a06ce08588a91a313768
Gerrit-Change-Number: 22157
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab <[email protected]>
