Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/22559 )
Change subject: IMPALA-11402: Add limit on files fetched by a single getPartialCatalogObject request
......................................................................

Patch Set 17:

(3 comments)

Thanks Quanlong. I've gone through the patch again and have some questions.

Also, what happens if the table is modified between two (or more) RPCs? E.g. we write files into all partitions, and the partitions returned in the first TGetPartialCatalogObjectResponse do not contain the new files, but the second batch already does. Is this possible? Is it a new kind of problem, or did we already have it in other places?

http://gerrit.cloudera.org:8080/#/c/22559/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/22559/17/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2356
PS17, Line 2356:
Nit: could add "startup flag 'catalog_partial...'".

http://gerrit.cloudera.org:8080/#/c/22559/17/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/22559/17/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1109
PS17, Line 1109: for (int i = numFetchedParts; i < ids.size(); i++) {
Would it make sense to request only as many partitions per request as the value of the limit? Or do we not know the limit here? (We would possibly also need to move the warning logs here.)

http://gerrit.cloudera.org:8080/#/c/22559/17/tests/custom_cluster/test_local_catalog.py
File tests/custom_cluster/test_local_catalog.py:

http://gerrit.cloudera.org:8080/#/c/22559/17/tests/custom_cluster/test_local_catalog.py@719
PS17, Line 719: err = ("Too many files to collect in table {0} partition year=2009/month=1: 2. "
Is this logged for each partition here? Would it make sense to include all, or at least multiple, partitions in the message?
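
To illustrate the CatalogdMetaProvider.java question above (PS17, Line 1109), here is a rough sketch of capping what each request asks for. It is not taken from the patch: the class name, `nextBatches` and `maxPartsPerRequest` are hypothetical, and it assumes the coordinator knows some per-request cap. Since the limit in this change is on files rather than partitions, a partition-count cap like this would only bound the request size, not guarantee that the file limit is respected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the patch under review.
final class PartitionBatchSketch {
  // Splits the partition ids that have not been fetched yet into batches of
  // at most maxPartsPerRequest ids, so each request asks for a bounded number
  // of partitions. maxPartsPerRequest is an assumed client-side cap.
  static List<List<Long>> nextBatches(
      List<Long> ids, int numFetchedParts, int maxPartsPerRequest) {
    if (maxPartsPerRequest <= 0) {
      throw new IllegalArgumentException("maxPartsPerRequest must be positive");
    }
    if (numFetchedParts < 0 || numFetchedParts > ids.size()) {
      throw new IllegalArgumentException("numFetchedParts out of range");
    }
    List<List<Long>> batches = new ArrayList<>();
    for (int start = numFetchedParts; start < ids.size();
         start += maxPartsPerRequest) {
      int end = Math.min(start + maxPartsPerRequest, ids.size());
      // Copy the view so each batch is independent of the source list.
      batches.add(new ArrayList<>(ids.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Long> ids = Arrays.asList(1L, 2L, 3L, 4L, 5L);
    // One partition already fetched, at most two partitions per request.
    System.out.println(nextBatches(ids, 1, 2)); // prints [[2, 3], [4, 5]]
  }
}
```

If the cap is not known on the coordinator side, the existing approach of requesting all remaining ids and letting catalogd truncate the response may still be the simpler option.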
--
To view, visit http://gerrit.cloudera.org:8080/22559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibb13fec20de5a17e7fc33613ca5cdebb9ac1a1e5
Gerrit-Change-Number: 22559
Gerrit-PatchSet: 17
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Wed, 16 Apr 2025 13:41:53 +0000
Gerrit-HasComments: Yes