[ https://issues.apache.org/jira/browse/FLINK-26505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
luoyuxia updated FLINK-26505:
-----------------------------
    Description: 
The following SQL, which runs successfully in Hive, throws "java.lang.IndexOutOfBoundsException: Index: 0, Size: 0":
{code:java}
select count(1)
from
  (select key
  from t1
  where key = 0) t1
left semi join
  (select key
  from t2
  where key = 0) t2
on 1 = 1;
{code}
From the source code, it calls `RelOptUtil.splitJoinCondition` to split the join condition and populate the `leftJoinKeys` and `rightJoinKeys` lists passed to the method:
{code:java}
public static RexNode splitJoinCondition(
        List<RelDataTypeField> sysFieldList,
        RelNode leftRel,
        RelNode rightRel,
        RexNode condition,
        List<RexNode> leftJoinKeys,
        List<RexNode> rightJoinKeys,
        List<Integer> filterNulls,
        List<SqlOperator> rangeOp)
{code}
But when no equality join keys are found, the behavior is weird: `leftJoinKeys` ends up containing two RexNodes while `rightJoinKeys` contains nothing. This causes the exception, because we expect `leftJoinKeys` and `rightJoinKeys` to have the same size.
It seems to be a Calcite issue: [CALCITE-5032|https://issues.apache.org/jira/browse/CALCITE-5032].
To fix it, we can follow what Hive has done and rewrite the method `RelOptUtil.splitJoinCondition`. What's more, this makes it possible to support the non-equi LEFT SEMI JOIN that Hive supports [HIVE-17766|https://issues.apache.org/jira/browse/HIVE-17766].
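The size invariant described above can be illustrated with a small, self-contained sketch. This is a toy model only, not Calcite's `RelOptUtil` and not the actual Flink fix: the class `JoinKeySplitSketch`, the `Conjunct` and `SplitResult` types, and the convention that operands prefixed with `t1.` or `t2.` belong to the left or right input are all hypothetical, introduced purely for illustration. The idea it shows is that keys should only ever be added to both lists as a pair, and that anything that is not a left/right equality (such as `1 = 1`) should stay behind as a residual condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of join-condition splitting; NOT Calcite's RelOptUtil. */
public class JoinKeySplitSketch {

    /** One conjunct of an ON clause, e.g. "t1.key = t2.key" (hypothetical type). */
    public static final class Conjunct {
        public final String op;
        public final String left;
        public final String right;

        public Conjunct(String op, String left, String right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return left + " " + op + " " + right;
        }
    }

    /** Paired equi-join keys plus the conjuncts left over as a residual filter. */
    public static final class SplitResult {
        public final List<String> leftKeys = new ArrayList<>();
        public final List<String> rightKeys = new ArrayList<>();
        public final List<Conjunct> remaining = new ArrayList<>();
    }

    // Assumption of this sketch: "t1." operands come from the left input,
    // "t2." operands from the right input. Constants match neither.
    private static boolean isLeft(String operand) {
        return operand.startsWith("t1.");
    }

    private static boolean isRight(String operand) {
        return operand.startsWith("t2.");
    }

    public static SplitResult split(List<Conjunct> conjuncts) {
        SplitResult result = new SplitResult();
        for (Conjunct c : conjuncts) {
            boolean isEquality = c.op.equals("=");
            if (isEquality && isLeft(c.left) && isRight(c.right)) {
                // Keys are always added to both lists together.
                result.leftKeys.add(c.left);
                result.rightKeys.add(c.right);
            } else if (isEquality && isRight(c.left) && isLeft(c.right)) {
                // Operands written in reverse order: normalize the pair.
                result.leftKeys.add(c.right);
                result.rightKeys.add(c.left);
            } else {
                // Non-equi or constant conjuncts become the residual condition.
                result.remaining.add(c);
            }
        }
        // The invariant whose violation leads to the IndexOutOfBoundsException.
        if (result.leftKeys.size() != result.rightKeys.size()) {
            throw new IllegalStateException("join key lists out of sync");
        }
        return result;
    }

    public static void main(String[] args) {
        // Models "on 1 = 1": no keys on either side, one residual conjunct.
        SplitResult constant = split(Arrays.asList(new Conjunct("=", "1", "1")));
        System.out.println(constant.leftKeys.size() + " "
                + constant.rightKeys.size() + " " + constant.remaining);
        // prints: 0 0 [1 = 1]
    }
}
```

With a condition such as `t1.key = t2.key and t1.key < t2.key`, the same sketch would yield one key pair and keep the `<` comparison as the residual condition, which is the shape a non-equi LEFT SEMI JOIN needs.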
> Support non equality condition for left semi join in Hive dialect
> -----------------------------------------------------------------
>
>                 Key: FLINK-26505
>                 URL: https://issues.apache.org/jira/browse/FLINK-26505
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>            Reporter: luoyuxia
>            Priority: Major
>             Fix For: 1.15.0

--
This message was sent by Atlassian Jira
(v8.20.1#820001)