Hi Ingo,

Thanks for the input.
I think converting `ANALYZE TABLE` to a `SELECT` statement is the more generic approach. Because query plan optimization is generic, we can add optimization rules that benefit not only the `SELECT` statements converted from `ANALYZE TABLE` but also the `SELECT` statements written by users.

> JDBC connector can get a row count estimate without performing a
> SELECT COUNT(1)

To optimize such cases, we can implement a rule that pushes the aggregate into the table source. Currently, there is a similar rule, SupportsAggregatePushDown, which for now supports pushing only local aggregates into the source.

Best,
Godfrey

Ingo Bürk <airbla...@apache.org> wrote on Fri, Jun 10, 2022 at 17:15:
>
> Hi Godfrey,
>
> compared to the solution proposed in the FLIP (using a SELECT
> statement), I wonder if you have considered adding APIs to catalogs /
> connectors to perform this task as an alternative?
> I could imagine that for many connectors, statistics could be
> implemented in a less expensive way by leveraging the underlying system
> (e.g. a JDBC connector can get a row count estimate without performing a
> SELECT COUNT(1)).
>
>
> Best
> Ingo
>
>
> On 10.06.22 09:53, godfrey he wrote:
> > Hi all,
> >
> > I would like to open a discussion on FLIP-240: Introduce "ANALYZE
> > TABLE" Syntax.
> >
> > As FLIP-231 mentioned, statistics are one of the most important inputs
> > to the optimizer. Accurate and complete statistics allows the
> > optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common
> > but effective approach to gather statistics, which is already
> > introduced by many compute engines and databases.
> >
> > The main purpose of discussion is to introduce "ANALYZE TABLE" syntax
> > for Flink sql.
> >
> > You can find more details in FLIP-240 document[1]. Looking forward to
> > your feedback.
> >
> > [1]
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
> > [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240
> >
> >
> > Best,
> > Godfrey
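
For illustration only, here is a minimal sketch of the aggregate push-down idea discussed in the reply at the top of this thread: a count is answered by the source itself when the source offers that ability, and falls back to a full scan otherwise. Every type below (`ScanSource`, `CountPushDown`, `PlainSource`, `CountingSource`, `PushDownSketch`) is hypothetical and made up for this example; none of them are Flink APIs, and the real SupportsAggregatePushDown interface and planner rules look different.

```java
// Illustrative sketch only: these types are hypothetical and are NOT Flink APIs.

/** A table source that can always be read with a full scan. */
interface ScanSource {
    long[] scan();
}

/** Optional ability: the source can answer COUNT(*) itself. */
interface CountPushDown {
    long pushedDownCount();
}

/** A source without the ability; counting requires reading every row. */
class PlainSource implements ScanSource {
    private final long[] rows;
    int scanCalls = 0; // how often a full scan was requested

    PlainSource(long[] rows) {
        this.rows = rows;
    }

    @Override
    public long[] scan() {
        scanCalls++;
        return rows;
    }
}

/** A source that keeps a row count it can return without scanning. */
class CountingSource extends PlainSource implements CountPushDown {
    private final long knownRowCount;

    CountingSource(long[] rows) {
        super(rows);
        this.knownRowCount = rows.length;
    }

    @Override
    public long pushedDownCount() {
        return knownRowCount;
    }
}

class PushDownSketch {
    /** Plays the role of the optimizer rule: prefer the pushed-down count. */
    static long rowCount(ScanSource source) {
        if (source instanceof CountPushDown) {
            return ((CountPushDown) source).pushedDownCount();
        }
        return source.scan().length; // fallback: full scan, then count
    }

    public static void main(String[] args) {
        PlainSource plain = new PlainSource(new long[] {1, 2});
        CountingSource counting = new CountingSource(new long[] {1, 2, 3});
        System.out.println(rowCount(plain) + " rows, scans=" + plain.scanCalls);
        System.out.println(rowCount(counting) + " rows, scans=" + counting.scanCalls);
    }
}
```

In this sketch the pushed-down count is exact. A source that can only offer an estimate (as in the JDBC example quoted above) would presumably need to be handled differently, since an estimate may be acceptable as a statistic but not as the result of a user's `SELECT COUNT(1)`.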