Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22559 )

Change subject: IMPALA-11402: Add limit on files fetched by a single getPartialCatalogObject request
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/22559/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/22559/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2360
PS9, Line 2360:           full_name_);
Updated the warning, e.g.

"Too many partitions requested by the query: 1824. Collected 1000 files of 1000 partitions of table tpcds.store_sales. Coordinator should fetch the remaining partitions in another request. This impacts metadata performance. Consider compacting files to improve it."

Please check if it's clear now.

> It might be helpful to also distinguish the case of many files per partition
> vs many partitions since the first case can be addressed without changing the
> partitioning.

It'd be better to warn about this when loading the metadata of the table:
https://github.com/apache/impala/blob/f98b697c7b37e18cb1101b62243974e42f72b9f4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1312
We have the total number of files and partitions there. Here we just collect partitions and files based on the coordinator's request. Some partitions might not be used by the query and so are not requested.
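
For readers following the thread, the batching behaviour that the updated warning describes could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the patch: the class `PartialFetchSketch`, the methods `collect` and `warning`, the `Batch` type and the `fileLimit` parameter are all hypothetical names, and the stop condition (stop once the collected file count reaches the limit, but always return at least one partition) is an assumption consistent with the example message rather than something confirmed here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from HdfsTable.java.
final class PartialFetchSketch {
  /** Partitions returned by one request and the number of files they hold. */
  static final class Batch {
    final List<String> partitions = new ArrayList<>();
    int numFiles = 0;
  }

  /**
   * Walks the requested partitions in order and stops once the number of
   * collected files has reached fileLimit (assumed semantics). At least one
   * partition is always returned so the caller can make progress.
   */
  static Batch collect(Map<String, Integer> filesPerPartition,
      List<String> requested, int fileLimit) {
    Batch batch = new Batch();
    for (String partition : requested) {
      if (!batch.partitions.isEmpty() && batch.numFiles >= fileLimit) break;
      batch.partitions.add(partition);
      batch.numFiles += filesPerPartition.getOrDefault(partition, 0);
    }
    return batch;
  }

  /** Builds a message in the shape quoted in the review comment. */
  static String warning(int numRequested, Batch batch, String tableName) {
    return "Too many partitions requested by the query: " + numRequested
        + ". Collected " + batch.numFiles + " files of "
        + batch.partitions.size() + " partitions of table " + tableName
        + ". Coordinator should fetch the remaining partitions in another"
        + " request. This impacts metadata performance. Consider compacting"
        + " files to improve it.";
  }

  public static void main(String[] args) {
    // 1824 requested partitions with one file each, as in the example message.
    Map<String, Integer> files = new LinkedHashMap<>();
    List<String> requested = new ArrayList<>();
    for (int i = 0; i < 1824; i++) {
      String name = "p" + i;
      files.put(name, 1);
      requested.add(name);
    }
    Batch batch = collect(files, requested, 1000);
    if (batch.partitions.size() < requested.size()) {
      System.out.println(warning(requested.size(), batch, "tpcds.store_sales"));
    }
  }
}
```

Under these assumptions the coordinator would issue a follow-up request for the partitions that were requested but not returned. Note that the sketch only sees the requested partitions, which is why a warning about the table's overall files-per-partition ratio fits better at metadata load time.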
There is a WIP JIRA that we can use to add such a warning: IMPALA-13122


--
To view, visit http://gerrit.cloudera.org:8080/22559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb13fec20de5a17e7fc33613ca5cdebb9ac1a1e5
Gerrit-Change-Number: 22559
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Comment-Date: Wed, 09 Apr 2025 03:54:26 +0000
Gerrit-HasComments: Yes