Hi,

Currently the biggest limitation that prevents better query optimisation is 
lack of table statistics (which are not trivial to provide in streaming), thus 
Joins/Aggregation reordering doesn’t work. We have some ideas how to tackle 
this issue and definitely at some point of time we will improve this.

Piotrek

> On 14 Jul 2018, at 06:48, Xingcan Cui <xingc...@gmail.com> wrote:
> 
> Hi Albert,
> 
> Calcite provides a rule-based optimizer (as a framework), which means users 
> can customize it by adding rules. That’s exactly what Flink did. From the 
> logical plan to the physical plan, the translations are triggered by 
> different sets of rules, according to which the relational expressions are 
> replaced, reordered or optimized.
> 
> However, IMO, the current optimization rules in Flink Table API are quite 
> primal. Some SQL statements (e.g., multiple joins) are just translated to 
> feasible execution plans, instead of optimized ones, since it’s much more 
> difficult to conduct query optimization on large datasets or dynamic streams. 
> You could first start from the Calcite query optimizer, and then try to make 
> your own rules.
> 
> Best,
> Xingcan
> 
>> On Jul 14, 2018, at 11:55 AM, vino yang <yanghua1...@gmail.com> wrote:
>> 
>> Hi Albert,
>> 
>> First I guess the query optimizer you mentioned is about Flink table & sql
>> (for batch API there is another optimizer which is implemented by Flink).
>> 
>> Yes, now for table & sql, Flink use Apache Calcite's query optimizer to
>> translate into a Calcite plan
>> which is then optimized according to Calcite's optimization rules.
>> 
>> The following rules are applied so far:
>> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/rules/FlinkRuleSets.scala
>> 
>> In view of Flink depends on the Calcite to do the optimization, I think
>> enhance Flink and Calcite would be the right direction.
>> 
>> Hope for you provide more idea and details. Flink community welcome your
>> idea and contribution.
>> 
>> Thanks.
>> Vino.
>> 
>> 
>> 2018-07-13 23:39 GMT+08:00 Albert Jonathan <alb...@cs.umn.edu>:
>> 
>>> Hello,
>>> 
>>> I am just wondering, does Flink use Apache Calcite's query optimizer to
>>> generate an optimal logical plan, or does it have its own query optimizer?
>>> From what I observed so far, the Flink's query optimizer only groups
>>> operator together without changing the order of aggregation operators
>>> (e.g., join). Did I miss anything?
>>> 
>>> I am thinking of extending Flink to apply query optimization as in the
>>> RDBMS by either integrating it with Calcite or implementing it as a new
>>> module.
>>> Any feedback or guidelines will be highly appreciated.
>>> 
>>> Thank you,
>>> Albert
>>> 
> 

Reply via email to