Hi Godfrey,

Thanks for your detailed explanation. After reading it and glancing over FLIP-231, I think this is really needed. +1 for this, and looking forward to it.
Best,
zoucao

godfrey he <godfre...@gmail.com> wrote on Mon, Jun 13, 2022 at 14:43:

> Hi Ingo,
>
> The semantics do not distinguish between batch and streaming.
> It works for both batch and streaming, but the result for
> unbounded sources is meaningless.
> Currently, I throw an exception for streaming mode,
> and we can support streaming mode with bounded sources
> in the future.
>
> Best,
> Godfrey
>
> Ingo Bürk <airbla...@apache.org> wrote on Mon, Jun 13, 2022 at 14:17:
> >
> > Hi Godfrey,
> >
> > thank you for the explanation. A SELECT is definitely more generic and
> > will work for all connectors automatically. As such I think it's a good
> > baseline solution regardless.
> >
> > We can also think about allowing connector-specific optimizations in the
> > future, but I do like your idea of letting the optimizer rules perform a
> > lot of the work here already by leveraging existing optimizations.
> > Similarly, things like non-null counts of non-nullable columns would (or
> > at least could) be handled by the optimizer rules already.
> >
> > So as far as that point goes, +1 to the generic approach.
> >
> > One more point, though: In general we should avoid supporting features
> > only in specific modes as it breaks the unification promise. Given that
> > ANALYZE is a manual and completely optional operation I'm OK with doing
> > that here in principle. However, I wonder what will happen in the
> > streaming / unbounded case. Do you plan to throw an error? Or do we
> > complete the command as successful but without doing anything?
> >
> >
> > Best
> > Ingo
> >
> > On 13.06.22 05:50, godfrey he wrote:
> > > Hi Ingo,
> > >
> > > Thanks for the inputs.
> > >
> > > I think converting `ANALYZE TABLE` to a `SELECT` statement is
> > > a more generic approach. Because query plan optimization is more generic,
> > > we can provide more optimization rules to optimize not only the `SELECT`
> > > statement converted from `ANALYZE TABLE` but also `SELECT` statements
> > > written by users.
> > >
> > >> JDBC connector can get a row count estimate without performing a
> > >> SELECT COUNT(1)
> > > To optimize such cases, we can implement a rule to push the aggregate
> > > into the table source.
> > > Currently, there is a similar rule: SupportsAggregatePushDown, which
> > > supports pushing only the local aggregate into the source for now.
> > >
> > >
> > > Best,
> > > Godfrey
> > >
> > > Ingo Bürk <airbla...@apache.org> wrote on Fri, Jun 10, 2022 at 17:15:
> > >>
> > >> Hi Godfrey,
> > >>
> > >> compared to the solution proposed in the FLIP (using a SELECT
> > >> statement), I wonder if you have considered adding APIs to catalogs /
> > >> connectors to perform this task as an alternative?
> > >> I could imagine that for many connectors, statistics could be
> > >> implemented in a less expensive way by leveraging the underlying system
> > >> (e.g. a JDBC connector can get a row count estimate without performing a
> > >> SELECT COUNT(1)).
> > >>
> > >>
> > >> Best
> > >> Ingo
> > >>
> > >>
> > >> On 10.06.22 09:53, godfrey he wrote:
> > >>> Hi all,
> > >>>
> > >>> I would like to open a discussion on FLIP-240: Introduce "ANALYZE
> > >>> TABLE" Syntax.
> > >>>
> > >>> As FLIP-231 mentioned, statistics are one of the most important inputs
> > >>> to the optimizer. Accurate and complete statistics allow the
> > >>> optimizer to be more powerful. "ANALYZE TABLE" syntax is a very common
> > >>> but effective approach to gather statistics, which is already
> > >>> introduced by many compute engines and databases.
> > >>>
> > >>> The main purpose of this discussion is to introduce "ANALYZE TABLE"
> > >>> syntax for Flink SQL.
> > >>>
> > >>> You can find more details in the FLIP-240 document [1]. Looking forward
> > >>> to your feedback.
> > >>>
> > >>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386481
> > >>> [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-240
> > >>>
> > >>>
> > >>> Best,
> > >>> Godfrey