[ https://issues.apache.org/jira/browse/FLINK-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145947#comment-15145947 ]
ASF GitHub Bot commented on FLINK-3226: --------------------------------------- Github user twalthr commented on a diff in the pull request: https://github.com/apache/flink/pull/1632#discussion_r52827003 --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/plan/rules/dataset/DataSetJoinRule.scala --- @@ -39,18 +49,80 @@ class DataSetJoinRule val convLeft: RelNode = RelOptRule.convert(join.getInput(0), DataSetConvention.INSTANCE) val convRight: RelNode = RelOptRule.convert(join.getInput(1), DataSetConvention.INSTANCE) - new DataSetJoin( - rel.getCluster, - traitSet, - convLeft, - convRight, - rel.getRowType, - join.toString, - Array[Int](), - Array[Int](), - JoinType.INNER, - null, - null) + // get the equality keys + val joinInfo = join.analyzeCondition + val keyPairs = joinInfo.pairs + + if (keyPairs.isEmpty) { // if no equality keys => not supported + throw new TableException("Joins should have at least one equality condition") + } + else { // at least one equality expression => generate a join function + val conditionType = join.getCondition.getType + val func = getJoinFunction(join, joinInfo) + val leftKeys = ArrayBuffer.empty[Int] + val rightKeys = ArrayBuffer.empty[Int] + + keyPairs.foreach(pair => { + leftKeys.add(pair.source) + rightKeys.add(pair.target)} + ) + + new DataSetJoin( + rel.getCluster, + traitSet, + convLeft, + convRight, + rel.getRowType, + join.toString, + leftKeys.toArray, + rightKeys.toArray, + JoinType.INNER, + null, + func) + } + } + + def getJoinFunction(join: FlinkJoin, joinInfo: JoinInfo): + ((TableConfig, TypeInformation[Any], TypeInformation[Any], TypeInformation[Any]) => + FlatJoinFunction[Any, Any, Any]) = { + + if (joinInfo.isEqui) { + // only equality condition => no join function necessary + null --- End diff -- In general, `null` is not very welcome in Scala. Could you return a `FlatJoinFunction` (containing only a `ConverterResultExpression`) here too? We can then get rid of the `Tuple2RowMapper` and support any type as output type of the join operation. > Translate optimized logical Table API plans into physical plans representing > DataSet programs > --------------------------------------------------------------------------------------------- > > Key: FLINK-3226 > URL: https://issues.apache.org/jira/browse/FLINK-3226 > Project: Flink > Issue Type: Sub-task > Components: Table API > Reporter: Fabian Hueske > Assignee: Chengxiang Li > > This issue is about translating an (optimized) logical Table API (see > FLINK-3225) query plan into a physical plan. The physical plan is a 1-to-1 > representation of the DataSet program that will be executed. This means: > - Each Flink RelNode refers to exactly one Flink DataSet or DataStream > operator. > - All (join and grouping) keys of Flink operators are correctly specified. > - The expressions which are to be executed in user-code are identified. > - All fields are referenced with their physical execution-time index. > - Flink type information is available. > - Optional: Add physical execution hints for joins > The translation should be the final part of Calcite's optimization process. > For this task we need to: > - implement a set of Flink DataSet RelNodes. Each RelNode corresponds to one > Flink DataSet operator (Map, Reduce, Join, ...). The RelNodes must hold all > relevant operator information (keys, user-code expression, strategy hints, > parallelism). > - implement rules to translate optimized Calcite RelNodes into Flink > RelNodes. We start with a straight-forward mapping and later add rules that > merge several relational operators into a single Flink operator, e.g., merge > a join followed by a filter. Timo implemented some rules for the first SQL > implementation which can be used as a starting point. > - Integrate the translation rules into the Calcite optimization process -- This message was sent by Atlassian JIRA (v6.3.4#6332)