[ https://issues.apache.org/jira/browse/FLINK-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187397#comment-16187397 ]
Xingcan Cui commented on FLINK-7730: ------------------------------------ Hi [~fhueske], I've checked the implementation and got some clues in the SQL level. For instance, given a simple table {{WordCount(word:String, frequency:Int)}}, a table function {{split: word:String => (letter:String, length:String)}}, and a SQL like that {code:sql} SELECT word, name, length FROM WordCount LEFT JOIN LATERAL TABLE(split(word)) AS T (letter, length) ON frequency = length OR length < 5 {code} , the query will be translated to the logical plan below. {code} LogicalProject(word=[$0], name=[$2], length=[$3]) LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))]) LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}]) LogicalTableScan(table=[[WordCount]]) LogicalTableFunctionScan(invocation=[split($cor0.word)], rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class [Ljava.lang.Object;]) {code} Apparently, this logical plan will lead to an improper physical plan, which first correlates each row with its table function results in the {{CorrelateFlatMapRunner}} (just like performing a cartesian product) and then filters the rows in a {{FlatMapFunction}}. It only works for inner join, but not for left outer join. Under this circumstance, could you give some suggestions on fixing this? BTW, I find it's hard to debug the dynamically generated functions (the IDE can not locate the source), is there some special techniques that can be used to deal with them? > TableFunction LEFT OUTER joins with ON predicates are broken > ------------------------------------------------------------ > > Key: FLINK-7730 > URL: https://issues.apache.org/jira/browse/FLINK-7730 > Project: Flink > Issue Type: Bug > Components: Table API & SQL > Affects Versions: 1.4.0, 1.3.2 > Reporter: Fabian Hueske > Assignee: Xingcan Cui > > TableFunction left outer joins with predicates in the ON clause are broken. > Apparently, the are no tests for this and it has never worked. I observed > issues on several layers: > - Table Function does not correctly validate equality predicate: > {{leftOuterJoin(func1('c) as 'd, 'a.cast(Types.STRING) === 'd)}} is rejected > because the predicate is not considered as an equality predicate (the cast > needs to be pushed down). > - Plans cannot be correctly translated: {{leftOuterJoin(func1('c) as 'd, 'c > === 'd)}} gives an optimizer exception. > - SQL queries get translated but produce incorrect results. For example > {{SELECT a, b, c, d FROM MyTable LEFT OUTER JOIN LATERAL TABLE(tfunc(c)) AS > T(d) ON d = c}} returns an empty result if the condition {{d = c}} never > returns true. However, the outer side should be preserved and padded with > nulls. > So there seem to be many issues with table function outer joins. Especially, > the wrong result produced by SQL queries need to be quickly fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)