Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/23436 )

Change subject: IMPALA-14447: Parallelize table loading in getMissingTables()
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/23436/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23436/1//COMMIT_MSG@11
PS1, Line 11: incur
> nit: incur
Done


http://gerrit.cloudera.org:8080/#/c/23436/1/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/23436/1/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@331
PS1, Line 331: // Use parallel stream to speed up table loadings from CatalogD in case of
> Tried this but it's not actually parallelized. There is a related discussio
Ack. Seeing more parallelism after using ArrayList.

I20250918 10:01:17.258389 1097707 Frontend.java:2435] 8942e75975bfa1dc:7fccacd200000000] Analyzing query: select count(*) from alltypes union select count(*) from alltypessmall union select count(*) from alltypestiny union select count(*) from alltypesagg db: functional
I20250918 10:01:17.258642 1097707 Frontend.java:2471] 8942e75975bfa1dc:7fccacd200000000] The original executor group sets from executor membership snapshot: [TExecutorGroupSet(curr_num_executors:3, expected_num_executors:20, exec_group_name_prefix:)]
I20250918 10:01:17.258844 1097707 Frontend.java:2491] 8942e75975bfa1dc:7fccacd200000000] A total of 1 executor group sets to be considered for auto-scaling: [TExecutorGroupSet(curr_num_executors:3, expected_num_executors:20, exec_group_name_prefix:, max_mem_limit:9223372036854775807, num_cores_per_executor:2147483647)]
I20250918 10:01:17.259057 1097707 Frontend.java:2535] 8942e75975bfa1dc:7fccacd200000000] Consider executor group set: TExecutorGroupSet(curr_num_executors:3, expected_num_executors:20, exec_group_name_prefix:, max_mem_limit:9223372036854775807, num_cores_per_executor:2147483647) with assumption of 0 cores per node.
I20250918 10:01:17.261566 1097707 CatalogdMetaProvider.java:640] 8942e75975bfa1dc:7fccacd200000000] Request for database list: miss
I20250918 10:01:17.262383 1097764 CatalogdMetaProvider.java:640] Request for table list of database functional: piggy-backed miss
I20250918 10:01:17.262591 1097763 CatalogdMetaProvider.java:640] Request for table list of database functional: piggy-backed miss
I20250918 10:01:17.262732 1097707 CatalogdMetaProvider.java:640] 8942e75975bfa1dc:7fccacd200000000] Request for table list of database functional: miss
I20250918 10:01:17.262842 1097766 CatalogdMetaProvider.java:640] Request for table list of database functional: piggy-backed miss
I20250918 10:01:17.295322 1097764 CatalogdMetaProvider.java:640] Request for table metadata for functional.alltypesagg: miss
I20250918 10:01:17.297480 1097764 CatalogdMetaProvider.java:951] Request for column stats of TableMetaRef functional.alltypesagg@4198: hit 0/ neg hit 0 / miss 14
I20250918 10:01:17.299315 1097764 CatalogdMetaProvider.java:640] Request for partition list for TableMetaRef functional.alltypesagg@4198: miss
I20250918 10:01:17.301469 1097764 CatalogdMetaProvider.java:640] Request for null partition key value: miss
I20250918 10:01:17.322903 1097763 CatalogdMetaProvider.java:640] Request for table metadata for functional.alltypessmall: miss
I20250918 10:01:17.327049 1097707 CatalogdMetaProvider.java:640] 8942e75975bfa1dc:7fccacd200000000] Request for table metadata for functional.alltypestiny: miss
I20250918 10:01:17.327354 1097707 CatalogdMetaProvider.java:895] 8942e75975bfa1dc:7fccacd200000000] Invalidated stale TABLE_LIST for DB functional after loading table alltypestiny: hasStaleType=false, hasStaleComment=true
I20250918 10:01:17.328145 1097763 CatalogdMetaProvider.java:951] Request for column stats of TableMetaRef functional.alltypessmall@4199: hit 0/ neg hit 0 / miss 13
I20250918 10:01:17.328971 1097763 CatalogdMetaProvider.java:640] Request for partition list for TableMetaRef functional.alltypessmall@4199: miss
I20250918 10:01:17.330861 1097707 CatalogdMetaProvider.java:951] 8942e75975bfa1dc:7fccacd200000000] Request for column stats of TableMetaRef functional.alltypestiny@4200: hit 0/ neg hit 0 / miss 13
I20250918 10:01:17.331364 1097707 CatalogdMetaProvider.java:640] 8942e75975bfa1dc:7fccacd200000000] Request for partition list for TableMetaRef functional.alltypestiny@4200: miss
I20250918 10:01:17.347568 1097766 CatalogdMetaProvider.java:640] Request for table metadata for functional.alltypes: miss
I20250918 10:01:17.348773 1097766 CatalogdMetaProvider.java:951] Request for column stats of TableMetaRef functional.alltypes@4201: hit 0/ neg hit 0 / miss 13
I20250918 10:01:17.349702 1097766 CatalogdMetaProvider.java:640] Request for partition list for TableMetaRef functional.alltypes@4201: miss
I20250918 10:01:17.351891 1097707 AnalysisContext.java:512] 8942e75975bfa1dc:7fccacd200000000] Analysis took 1 ms
I20250918 10:01:17.351955 1097707 BaseAuthorizationChecker.java:114] 8942e75975bfa1dc:7fccacd200000000] Authorization check took 0 ms
I20250918 10:01:17.351975 1097707 Frontend.java:2984] 8942e75975bfa1dc:7fccacd200000000] Analysis and authorization finished.

The catalogd logs also show the tables being loaded in parallel:

I20250918 10:01:17.262768 1097771 TableLoadingMgr.java:74] Loading metadata for table: functional.alltypesagg
I20250918 10:01:17.262812 1098601 TableLoader.java:81] Loading metadata for: functional.alltypesagg (needed by coordinator)
I20250918 10:01:17.263163 1097771 TableLoadingMgr.java:76] Remaining items in queue: 0. Loads in progress: 1
I20250918 10:01:17.263268 1098602 TableLoader.java:81] Loading metadata for: functional.alltypes (needed by coordinator)
I20250918 10:01:17.263495 1097772 TableLoadingMgr.java:74] Loading metadata for table: functional.alltypes
I20250918 10:01:17.263535 1097772 TableLoadingMgr.java:76] Remaining items in queue: 0. Loads in progress: 4
I20250918 10:01:17.263582 1097767 TableLoadingMgr.java:74] 3e4d658776240c2d:cc23c67200000000] Loading metadata for table: functional.alltypessmall
I20250918 10:01:17.263619 1098603 TableLoader.java:81] Loading metadata for: functional.alltypessmall (needed by coordinator)
I20250918 10:01:17.263808 1097767 TableLoadingMgr.java:76] 3e4d658776240c2d:cc23c67200000000] Remaining items in queue: 0. Loads in progress: 4
I20250918 10:01:17.263988 1098604 TableLoader.java:81] Loading metadata for: functional.alltypestiny (needed by coordinator)
I20250918 10:01:17.264189 1097768 TableLoadingMgr.java:74] Loading metadata for table: functional.alltypestiny
I20250918 10:01:17.264432 1097768 TableLoadingMgr.java:76] Remaining items in queue: 0. Loads in progress: 4


http://gerrit.cloudera.org:8080/#/c/23436/1/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@336
PS1, Line 336: .map(tblName -> {
> The thread-safetiness mostly offered by the underlying MetaProvider.
Patch set 2 made LocalCatalog.loadDbs() more robust.


http://gerrit.cloudera.org:8080/#/c/23436/1/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@339
PS1, Line 339: }
> Logs of the other threads are missing the query id.
I don't think we can do anything about it. The query id is prepended on the C++ side using Impala::ThreadDebugInfo():
https://gerrit.cloudera.org/c/12129/


http://gerrit.cloudera.org:8080/#/c/23436/1/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@342
PS1, Line 342: dbs_.add(tblName.getDb());
> nit: Can we reduce some idention? E.g.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/23436
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I97a5165844ae846b28338d62e93a20121488d79f
Gerrit-Change-Number: 23436
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Comment-Date: Thu, 18 Sep 2025 18:00:34 +0000
Gerrit-HasComments: Yes
