Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/22559 )

Change subject: IMPALA-11402: Add limit on files fetched by a single getPartialCatalogObject request
......................................................................


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/22559/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/22559/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2360
PS9, Line 2360:             full_name_);
Updated the warning, e.g. "Too many partitions requested by the query: 1824. 
Collected 1000 files of 1000 partitions of table tpcds.store_sales. Coordinator 
should fetch the remaining partitions in another request. This impacts metadata 
performance. Consider compacting files to improve it."

Please check if it's clear now.
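
For illustration only, here is a minimal sketch of how a capped collection could produce a message of that shape. The class, method, and parameter names are hypothetical and not taken from the patch, and the stopping rule (stop once the running file count reaches the limit) is an assumption. Only the message wording follows the example above.

```java
// Hypothetical sketch, not the code in HdfsTable.java.
class FileLimitSketch {
  /**
   * Walks the requested partitions in order and stops once the number of
   * collected files has reached fileLimit (assumed stopping rule).
   * Returns a warning if some requested partitions were left out, else null.
   */
  static String collect(String tableName, int[] filesPerRequestedPartition,
      int fileLimit) {
    int numFiles = 0;
    int numPartitions = 0;
    for (int filesInPartition : filesPerRequestedPartition) {
      if (numFiles >= fileLimit) break;  // limit reached, skip the rest
      numFiles += filesInPartition;
      numPartitions++;
    }
    if (numPartitions == filesPerRequestedPartition.length) return null;
    return String.format(
        "Too many partitions requested by the query: %d. Collected %d files "
            + "of %d partitions of table %s. Coordinator should fetch the "
            + "remaining partitions in another request.",
        filesPerRequestedPartition.length, numFiles, numPartitions, tableName);
  }
}
```

With 1824 requested partitions of one file each and a limit of 1000, this sketch reports 1000 files of 1000 partitions, matching the numbers in the example message.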

> It might be helpful to also distinguish the case of many files per partition 
> vs many partitions since the first case can be addressed without changing the 
> partitioning.

It'd be better to warn about this when loading the table's metadata:
https://github.com/apache/impala/blob/f98b697c7b37e18cb1101b62243974e42f72b9f4/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1312
We have the total number of files and partitions there. Here we only collect 
the partitions and files that the coordinator requests. Partitions not used by 
the query are not requested, so the counts here don't reflect the whole table.
There is a WIP JIRA that we can use to add such a warning: IMPALA-13122.
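
As a rough sketch of what such a load-time check could look like, given the table-wide totals: the class, enum, and threshold parameters below are all hypothetical and are not part of the patch or of IMPALA-13122.

```java
// Hypothetical sketch of a load-time check using table-wide totals.
class LoadTimeWarningSketch {
  enum Cause { NONE, MANY_FILES_PER_PARTITION, MANY_PARTITIONS }

  /**
   * Classifies a table by its total file and partition counts.
   * Both thresholds are assumed, illustrative parameters.
   */
  static Cause classify(long totalFiles, long totalPartitions,
      long maxPartitions, long maxAvgFilesPerPartition) {
    // Many files per partition: can be addressed by compacting files,
    // without changing the partitioning.
    if (totalPartitions > 0
        && totalFiles > maxAvgFilesPerPartition * totalPartitions) {
      return Cause.MANY_FILES_PER_PARTITION;
    }
    // Many partitions: compaction alone does not reduce the partition count.
    if (totalPartitions > maxPartitions) return Cause.MANY_PARTITIONS;
    return Cause.NONE;
  }
}
```

The first case is checked before the second so that a table with both problems gets the advice that is cheaper to act on (compaction).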



--
To view, visit http://gerrit.cloudera.org:8080/22559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb13fec20de5a17e7fc33613ca5cdebb9ac1a1e5
Gerrit-Change-Number: 22559
Gerrit-PatchSet: 13
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Comment-Date: Wed, 09 Apr 2025 03:54:26 +0000
Gerrit-HasComments: Yes
