[ 
https://issues.apache.org/jira/browse/FLINK-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222325#comment-16222325
 ] 

Xingcan Cui commented on FLINK-7800:
------------------------------------

Hi [~fhueske], the problem is not as easy as I expected. The key point is that 
once we remove the equi-predicate check in {{FlinkLogicalJoin}}, the optimizer 
will generate different candidate query plans. For instance, given the 
following test expression:
{code}
val joinT = ds1.join(ds2).filter('b === 'h + 1 && 'a - 1 === 'd + 2).select('c, 'g)
{code}
two kinds of plans become candidates: those with and those without 
equi-predicates in {{LogicalJoin}}. Worse still, the plans without 
equi-predicates may have a lower cost (mainly in terms of IO for the DataSet 
join) and thus be selected as the final plan.

To solve this, we need a mechanism that ensures plans with equi-predicates are 
always preferred, regardless of their costs. Maybe that can be implemented by 
adding a "punishment factor" to the cost of plans without equi-predicates (or, 
conversely, an "enhancement factor" to plans with them), but I am concerned 
that this may break the existing cost model. Do you have any ideas about 
that?
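
To make the "punishment factor" idea concrete, here is a minimal, standalone 
sketch. It is a toy model and not the Flink or Calcite cost model: 
{{PlanCandidate}}, {{NonEquiPenalty}}, {{adjustedCost}} and {{choose}} are 
hypothetical names, and the penalty value is an arbitrary assumption.

```scala
// Toy model only: PlanCandidate, NonEquiPenalty, adjustedCost and choose are
// hypothetical names, not Flink or Calcite APIs.
final case class PlanCandidate(name: String, rawCost: Double, hasEquiPredicate: Boolean)

object PenalizedPlanChoice {
  // Assumed penalty factor. It only guarantees the preference if it is larger
  // than any cost ratio that can occur between two candidates.
  val NonEquiPenalty: Double = 1e6

  // Plans with equi-predicates keep their cost; the others are scaled up.
  def adjustedCost(plan: PlanCandidate): Double =
    if (plan.hasEquiPredicate) plan.rawCost else plan.rawCost * NonEquiPenalty

  // Pick the candidate with the lowest adjusted cost (the first one wins on ties).
  def choose(candidates: Seq[PlanCandidate]): PlanCandidate =
    candidates.reduceLeft((best, next) =>
      if (adjustedCost(next) < adjustedCost(best)) next else best)
}
```

With a raw cost of 120 for the plan with equi-predicates and 100 for the plan 
without, {{choose}} returns the plan with equi-predicates. The sketch also 
shows the concern: the scaled cost no longer reflects the estimated IO, so any 
later comparison that relies on it would be distorted.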

Best, Xingcan

> Enable window joins without equi-join predicates
> ------------------------------------------------
>
>                 Key: FLINK-7800
>                 URL: https://issues.apache.org/jira/browse/FLINK-7800
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>    Affects Versions: 1.4.0
>            Reporter: Fabian Hueske
>            Assignee: Xingcan Cui
>
> Currently, windowed joins can only be translated if they have at least one 
> equi-join predicate. This limitation exists due to the lack of a good cross 
> join strategy for the DataSet API.
> Due to the window, windowed joins do not have to be executed as cross joins. 
> Hence, the equi-join limitation does not need to be enforced (even though 
> non-equi joins are executed with a parallelism of 1 right now).
> We can resolve this issue by adding a boolean flag to the 
> {{FlinkLogicalJoinConverter}} rule to permit non-equi joins and adding such a 
> rule to the logical optimization set of the DataStream API.



