Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/22458 )
Change subject: IMPALA-13737: Directly load file metadata via IcebergFileMetadataLoader
......................................................................

IMPALA-13737: Directly load file metadata via IcebergFileMetadataLoader

Currently we let HdfsTable drive file metadata loading of Iceberg
tables. To have better control over file loading, IcebergTable should
use IcebergFileMetadataLoader directly. The underlying HdfsTable can
then be empty, which will make it easier to remove this dependency
completely. It also de-duplicates file descriptors in local catalog
mode.

This patch also clarifies the responsibilities of
IcebergFileMetadataLoader and IcebergContentFileStore. The former is in
charge of loading the file descriptors and decorating them with Iceberg
metadata. The latter is only responsible for grouping and storing them
in an efficient manner.

This patch removes the dependency of IcebergContentFileStore on
FeIcebergTable, which will make the REST Catalog implementation cleaner.

Measurements (Thanks to Gabor Kaszab for the numbers)

As mentioned above, this patch de-duplicates the file descriptors in
local catalog mode. That is, it greatly reduces the memory footprint
(IMPALA-11265) in the Coordinator when local catalog is being used.
The measured table had 110k files, 16400 partitions, 1000 manifests,
1000 snapshots.
The memory footprint:
  Before this patch: 107MB
  After this patch: 74MB

Testing:
 * no new functionalities added, existing tests should work

Change-Id: Iaf7e23ec21b65036b47edadcb4cbe4b64be3baee
Reviewed-on: http://gerrit.cloudera.org:8080/22458
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
11 files changed, 148 insertions(+), 170 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/22458
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iaf7e23ec21b65036b47edadcb4cbe4b64be3baee
Gerrit-Change-Number: 22458
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
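
The commit message describes a split of responsibilities: a loader that produces file descriptors decorated with Iceberg metadata, and a store that only groups and de-duplicates them. The following is a minimal sketch of that idea, not the code in this change. Every name in it (IcebergFileStoreSketch, FileDesc, ContentType, Store, load) is hypothetical, and keying de-duplication on the file path and defaulting unknown files to DATA are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are Impala's actual classes.
public class IcebergFileStoreSketch {
  enum ContentType { DATA, DELETE }

  // A file descriptor decorated with Iceberg-level metadata
  // (only the content type is modelled here).
  static final class FileDesc {
    final String path;
    final long length;
    final ContentType type;

    FileDesc(String path, long length, ContentType type) {
      this.path = path;
      this.length = length;
      this.type = type;
    }
  }

  // Loader role: turn a raw listing (path -> length) into decorated
  // descriptors. Files without an entry in icebergTypes default to DATA
  // (an assumption of this sketch).
  static List<FileDesc> load(Map<String, Long> listing,
                             Map<String, ContentType> icebergTypes) {
    List<FileDesc> out = new ArrayList<>();
    for (Map.Entry<String, Long> e : listing.entrySet()) {
      ContentType t = icebergTypes.getOrDefault(e.getKey(), ContentType.DATA);
      out.add(new FileDesc(e.getKey(), e.getValue(), t));
    }
    return out;
  }

  // Store role: group descriptors by content type and keep one per path.
  static final class Store {
    private final Map<String, FileDesc> byPath = new LinkedHashMap<>();
    private final Map<ContentType, List<FileDesc>> byType =
        new EnumMap<>(ContentType.class);

    void add(FileDesc fd) {
      // putIfAbsent returns the existing value if the path is already stored.
      if (byPath.putIfAbsent(fd.path, fd) != null) return;
      byType.computeIfAbsent(fd.type, k -> new ArrayList<>()).add(fd);
    }

    int size() { return byPath.size(); }

    List<FileDesc> files(ContentType t) {
      return byType.getOrDefault(t, List.of());
    }
  }

  public static void main(String[] args) {
    Map<String, Long> listing = new LinkedHashMap<>();
    listing.put("data/a.parquet", 100L);
    listing.put("data/b.parquet", 200L);
    listing.put("data/d1.parquet", 10L);
    Map<String, ContentType> types =
        Map.of("data/d1.parquet", ContentType.DELETE);

    Store store = new Store();
    // Adding the same listing twice leaves the store unchanged the second time.
    for (FileDesc fd : load(listing, types)) store.add(fd);
    for (FileDesc fd : load(listing, types)) store.add(fd);

    System.out.println("files=" + store.size()
        + " data=" + store.files(ContentType.DATA).size()
        + " delete=" + store.files(ContentType.DELETE).size());
  }
}
```

In this sketch the store never consults a table object; it sees only descriptors that are already decorated, which mirrors the decoupling the commit message describes.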
