[ 
https://issues.apache.org/jira/browse/IMPALA-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18020981#comment-18020981
 ] 

Riza Suminto commented on IMPALA-14447:
---------------------------------------

I think we can parallelize within StmtMetadataLoader.getMissingTables().

CatalogdMetaProvider.loadWithCaching() already mitigates concurrent loading, 
so running multiple StmtMetadataLoader instances concurrently should be fine.
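
To illustrate the idea, here is a minimal sketch of triggering all the loads first and waiting for them afterwards, instead of loading one table at a time. It is not Impala code: the class name, loadAll(), the loader function, and the parallelism parameter are all hypothetical, and the loader merely stands in for a blocking call such as LocalDb.getTable(). A real change would also need to decide on pool sizing and error handling.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in Impala.
class ParallelTableLoadSketch {

  // Submits one load per distinct table name to a bounded pool, then waits
  // for all of them. 'loader' stands in for a blocking metadata load.
  static <T> Map<String, T> loadAll(
      List<String> tableNames, Function<String, T> loader, int parallelism) {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, parallelism));
    try {
      // Phase 1: trigger every load without waiting on any of them.
      Map<String, CompletableFuture<T>> pending = new LinkedHashMap<>();
      for (String name : tableNames) {
        pending.computeIfAbsent(
            name, n -> CompletableFuture.supplyAsync(() -> loader.apply(n), pool));
      }
      // Phase 2: collect results in the order the names were first seen.
      // join() rethrows a loader failure wrapped in CompletionException.
      Map<String, T> loaded = new LinkedHashMap<>();
      for (Map.Entry<String, CompletableFuture<T>> e : pending.entrySet()) {
        loaded.put(e.getKey(), e.getValue().join());
      }
      return loaded;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // Fake loader: returns a placeholder string instead of real metadata.
    Map<String, String> result =
        loadAll(Arrays.asList("db.t1", "db.t2", "db.t1"), n -> "meta:" + n, 2);
    System.out.println(result);
  }
}
```

With this shape, the total wait is bounded by the slowest load rather than the sum of all loads, provided the pool has enough threads and the underlying provider tolerates concurrent requests.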

> Metadata loading is not triggered in parallel in local catalog mode
> -------------------------------------------------------------------
>
>                 Key: IMPALA-14447
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14447
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Priority: Major
>
> When a query accesses multiple unloaded tables, their metadata loading is 
> triggered sequentially in local catalog mode. The stack trace of the 
> coordinator thread:
> {noformat}
> "Thread-20 [LoadWithCaching for table metadata for default.part_900k_parq1]" #112 prio=5 os_prio=0 tid=0x000000000aa08000 nid=0x27f7 runnable [0x00007fcb9afe5000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.impala.service.FeSupport.NativeGetPartialCatalogObject(Native Method)
>         at org.apache.impala.service.FeSupport.GetPartialCatalogObject(FeSupport.java:472)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:463)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider.access$100(CatalogdMetaProvider.java:209)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider$4.call(CatalogdMetaProvider.java:815)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider$4.call(CatalogdMetaProvider.java:807)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider.loadWithCaching(CatalogdMetaProvider.java:601)
>         at org.apache.impala.catalog.local.CatalogdMetaProvider.loadTable(CatalogdMetaProvider.java:803)
>         at org.apache.impala.catalog.local.LocalTable.loadTableMetadata(LocalTable.java:164)
>         at org.apache.impala.catalog.local.LocalTable.load(LocalTable.java:114)
>         at org.apache.impala.catalog.local.LocalDb.getTable(LocalDb.java:148)
>         at org.apache.impala.analysis.StmtMetadataLoader.getMissingTables(StmtMetadataLoader.java:323)
>         at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:176)
>         at org.apache.impala.analysis.StmtMetadataLoader.loadTables(StmtMetadataLoader.java:145)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2600)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:2295)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:2032)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:171)
> {noformat}
> For an unloaded table, metadata loading is triggered by the first call to 
> LocalDb.getTable(). Loading of the second unloaded table is triggered only 
> after the first load completes.
> This is a performance regression compared to the legacy catalog mode, where 
> metadata loading for all tables accessed by a query is triggered in parallel.
> CC [~rizaon]


