[ https://issues.apache.org/jira/browse/FLINK-20416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17249552#comment-17249552 ]
Sebastian Liu commented on FLINK-20416:
---------------------------------------

Hi [~jark], [~lirui],

Thanks for your suggestions. I've drafted a design doc about the catalog cache and am looking forward to your comments and feedback.
[https://docs.google.com/document/d/1oL8HUpv2WaF6OkFvbH5iefXkOJB__Dal_bYsIZJA_Gk/edit?usp=sharing]

> Need a cached catalog for batch SQL job
> ---------------------------------------
>
>                 Key: FLINK-20416
>                 URL: https://issues.apache.org/jira/browse/FLINK-20416
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Common, Connectors / Hive, Table SQL / API, Table SQL / Planner
>            Reporter: Sebastian Liu
>            Priority: Major
>              Labels: pull-request-available
>
> For OLAP scenarios, there are usually some analytical queries whose running
> time is relatively short. These queries are also sensitive to latency. In the
> current Blink SQL processing, the parse/validate/optimize stages all need
> metadata from the catalog API, but each request to the catalog requires
> re-running the underlying meta query.
>
> We may need a cached catalog that caches the table schema and statistics
> to avoid unnecessary repeated meta requests.
> I have submitted a related PR that adds a generic cached catalog, which can
> delegate to other implementations of {{AbstractCatalog}}:
> {{[https://github.com/apache/flink/pull/14260]}}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
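
For readers skimming the thread, the delegating cache idea from the issue description can be illustrated with a rough sketch. This is not the code in the PR and does not use Flink's actual catalog interfaces: `MetaLookup`, `CachingLookup`, `getTableMeta` and `invalidate` are hypothetical names invented for illustration, and the table metadata is reduced to a plain string. It only shows the general pattern of wrapping an existing lookup so that repeated requests for the same table during parse/validate/optimize hit the backend once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; all names here are hypothetical, not Flink APIs. */
class CachedCatalogSketch {

    /** Stand-in for a catalog metadata call (far smaller than a real catalog interface). */
    interface MetaLookup {
        String getTableMeta(String tablePath);
    }

    /** Wraps any MetaLookup and remembers one result per table path. */
    static final class CachingLookup implements MetaLookup {
        private final MetaLookup delegate;
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        CachingLookup(MetaLookup delegate) {
            this.delegate = delegate;
        }

        @Override
        public String getTableMeta(String tablePath) {
            // Only calls the delegate when the path is not cached yet.
            return cache.computeIfAbsent(tablePath, delegate::getTableMeta);
        }

        /** Drops a cached entry, e.g. after the table has been altered. */
        void invalidate(String tablePath) {
            cache.remove(tablePath);
        }
    }

    public static void main(String[] args) {
        int[] backendCalls = {0};
        // Pretend backend that counts how often the "meta query" runs.
        MetaLookup backend = path -> {
            backendCalls[0]++;
            return "schema-of-" + path;
        };
        CachingLookup cached = new CachingLookup(backend);

        cached.getTableMeta("db.orders"); // e.g. during parse
        cached.getTableMeta("db.orders"); // e.g. during validate
        cached.getTableMeta("db.orders"); // e.g. during optimize

        System.out.println("backend calls: " + backendCalls[0]); // prints: backend calls: 1
    }
}
```

A real implementation would also need an invalidation or expiry policy so that cached schema and statistics do not go stale; the `invalidate` method above is only a placeholder for that concern.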