Hi Ingo,

Thanks for the input.

I think converting `ANALYZE TABLE` to a `SELECT` statement is the
more generic approach. Because query plan optimization is generic,
we can add optimization rules that benefit not only the `SELECT` statement
converted from `ANALYZE TABLE` but also `SELECT` statements written by users.

> JDBC connector can get a row count estimate without performing a
> SELECT COUNT(1)
To optimize such cases, we can implement a rule that pushes the aggregate
into the table source.
Currently there is a similar ability, SupportsAggregatePushDown, but it
only supports pushing the local aggregate into the source.
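
To make the idea concrete, here is a rough, self-contained sketch of such a
rule. All names in it (TableSource, SupportsCountPushDown, planCount, and the
two source classes) are hypothetical and are not Flink APIs; it only
illustrates preferring a count offered by the source and falling back to a
full scan otherwise.

```java
import java.util.OptionalLong;

public class AggregatePushDownSketch {

    /** Hypothetical stand-in for a table source; NOT a Flink interface. */
    interface TableSource {
        /** Counts rows the expensive way, by scanning everything. */
        long scanAndCount();
    }

    /** Hypothetical ability: the source may answer COUNT(*) itself. */
    interface SupportsCountPushDown {
        /** Returns a count if the source can provide one cheaply, else empty. */
        OptionalLong tryCount();
    }

    /** A source that can only be counted by a full scan. */
    static final class ScanOnlySource implements TableSource {
        private final long rows;
        int scans = 0; // number of full scans performed, for illustration

        ScanOnlySource(long rows) {
            this.rows = rows;
        }

        @Override
        public long scanAndCount() {
            scans++;
            return rows;
        }
    }

    /** A JDBC-like source that can report a row count from metadata. */
    static final class MetadataSource implements TableSource, SupportsCountPushDown {
        private final long metadataCount;
        int scans = 0;

        MetadataSource(long metadataCount) {
            this.metadataCount = metadataCount;
        }

        @Override
        public long scanAndCount() {
            scans++;
            return metadataCount;
        }

        @Override
        public OptionalLong tryCount() {
            return OptionalLong.of(metadataCount);
        }
    }

    /**
     * The "rule": use the pushed-down count when the source offers one,
     * otherwise fall back to scanning.
     */
    static long planCount(TableSource source) {
        if (source instanceof SupportsCountPushDown) {
            OptionalLong pushed = ((SupportsCountPushDown) source).tryCount();
            if (pushed.isPresent()) {
                return pushed.getAsLong();
            }
        }
        return source.scanAndCount();
    }

    public static void main(String[] args) {
        ScanOnlySource plain = new ScanOnlySource(42);
        MetadataSource jdbcLike = new MetadataSource(1000);
        System.out.println(planCount(plain) + " rows, scans=" + plain.scans);
        System.out.println(planCount(jdbcLike) + " rows, scans=" + jdbcLike.scans);
    }
}
```

The point of the sketch is that the same rule would serve both the `SELECT`
generated from `ANALYZE TABLE` and a `SELECT COUNT(1)` written by a user.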


Best,
Godfrey

Ingo Bürk <airbla...@apache.org> wrote on Fri, Jun 10, 2022 at 17:15:
>
> Hi Godfrey,
>
> compared to the solution proposed in the FLIP (using a SELECT
> statement), I wonder if you have considered adding APIs to catalogs /
> connectors to perform this task as an alternative?
> I could imagine that for many connectors, statistics could be
> implemented in a less expensive way by leveraging the underlying system
> (e.g. a JDBC connector can get a row count estimate without performing a
> SELECT COUNT(1)).
>
>
> Best
> Ingo
>
>
> On 10.06.22 09:53, godfrey he wrote:
> > Hi all,
> >
> > I would like to open a discussion on FLIP-240:  Introduce "ANALYZE
> > TABLE" Syntax.
> >
> > As FLIP-231 mentioned, statistics are one of the most important inputs
> > to the optimizer. Accurate and complete statistics allow the
> > optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common
> > but effective approach to gather statistics, which is already
> > introduced by many compute engines and databases.
> >
> > The main purpose of this discussion is to introduce "ANALYZE TABLE" syntax
> > for Flink SQL.
> >
> > You can find more details in FLIP-240 document[1]. Looking forward to
> > your feedback.
> >
> > [1] 
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
> > [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240
> >
> >
> > Best,
> > Godfrey
