[ 
https://issues.apache.org/jira/browse/FLINK-26505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luoyuxia updated FLINK-26505:
-----------------------------
    Description: 
The following SQL, which runs successfully in Hive, fails in Flink with 
"java.lang.IndexOutOfBoundsException: Index: 0, Size: 0":
{code:sql}
select count(1)
from
  (select key
  from t1
  where key = 0) t1
left semi join
  (select key
  from t2
  where key = 0) t2
on 1 = 1;
{code}
From the source code, Flink calls `RelOptUtil.splitJoinCondition` to split the 
join condition and populate the `leftJoinKeys` and `rightJoinKeys` lists passed 
to the method:
{code:java}
  public static RexNode splitJoinCondition(
      List<RelDataTypeField> sysFieldList,
      RelNode leftRel,
      RelNode rightRel,
      RexNode condition,
      List<RexNode> leftJoinKeys,
      List<RexNode> rightJoinKeys,
      List<Integer> filterNulls,
      List<SqlOperator> rangeOp)
{code}

However, when no equality join keys are found, the behavior is weird: 
`leftJoinKeys` contains two RexNodes while `rightJoinKeys` is empty. This causes 
the exception, because we expect `leftJoinKeys` and `rightJoinKeys` to have the 
same size.
This appears to be a Calcite issue: 
[CALCITE-5032|https://issues.apache.org/jira/browse/CALCITE-5032].
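
The failure mode can be illustrated with a minimal, self-contained sketch. This is not Flink's or Calcite's code: plain strings stand in for `RexNode`, and the class `JoinKeyMismatch` with its methods `pairKeys` and `keysAreConsistent` is hypothetical. It only shows why pairing keys by index fails once the left list holds two entries and the right list holds none.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JoinKeyMismatch {

    /** Pairs keys by position, as a caller that assumes equal-sized lists would. */
    static List<String> pairKeys(List<String> leftJoinKeys, List<String> rightJoinKeys) {
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < leftJoinKeys.size(); i++) {
            // rightJoinKeys.get(i) throws IndexOutOfBoundsException if the list is shorter.
            pairs.add(leftJoinKeys.get(i) + " = " + rightJoinKeys.get(i));
        }
        return pairs;
    }

    /** Hypothetical guard: pairing by position is only safe for equal-sized lists. */
    static boolean keysAreConsistent(List<String> leftJoinKeys, List<String> rightJoinKeys) {
        return leftJoinKeys.size() == rightJoinKeys.size();
    }

    public static void main(String[] args) {
        // State reported for "on 1 = 1": two entries on the left, none on the right.
        List<String> left = Arrays.asList("1", "1");
        List<String> right = new ArrayList<>();

        System.out.println("consistent: " + keysAreConsistent(left, right));
        try {
            pairKeys(left, right);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("pairing failed: " + e.getMessage());
        }
    }
}
```

The exact exception message depends on the Java version, so the sketch prints whatever the runtime reports instead of assuming a fixed string.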

To fix it, we can follow what Hive has done and rewrite the method 
`RelOptUtil.splitJoinCondition`. In addition, this makes it possible to support 
the non-equi LEFT SEMI JOIN that Hive supports 
[HIVE-17766|https://issues.apache.org/jira/browse/HIVE-17766].
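
As a rough sketch of the property a rewritten splitter needs, the toy model below adds to both key lists only in pairs and returns everything else as a residual condition. It is an illustration under assumptions, not the Hive or Flink implementation: `SplitJoinConditionSketch`, `Conjunct`, `references`, and `split` are hypothetical names, strings stand in for `RexNode`, and an operand is treated as a column reference only when it is written as `table.column`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitJoinConditionSketch {

    /** Hypothetical stand-in for one conjunct "left OP right" of a join condition. */
    static final class Conjunct {
        final String left;
        final String op;
        final String right;

        Conjunct(String left, String op, String right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }
    }

    /** In this toy model an operand references a table when it is qualified, e.g. "t1.key". */
    static boolean references(String operand, String table) {
        return operand.startsWith(table + ".");
    }

    /**
     * Appends one entry to BOTH key lists for every equality between the two inputs
     * and returns the conjuncts that could not be turned into a key pair.
     */
    static List<Conjunct> split(
            List<Conjunct> conjuncts,
            String leftTable,
            String rightTable,
            List<String> leftJoinKeys,
            List<String> rightJoinKeys) {
        List<Conjunct> remaining = new ArrayList<>();
        for (Conjunct c : conjuncts) {
            boolean isEquality = "=".equals(c.op);
            if (isEquality && references(c.left, leftTable) && references(c.right, rightTable)) {
                leftJoinKeys.add(c.left);
                rightJoinKeys.add(c.right);
            } else if (isEquality && references(c.left, rightTable) && references(c.right, leftTable)) {
                // Operands are written right-to-left, so swap them.
                leftJoinKeys.add(c.right);
                rightJoinKeys.add(c.left);
            } else {
                // Constants ("1 = 1") and non-equi predicates stay in the residual condition.
                remaining.add(c);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<String> leftKeys = new ArrayList<>();
        List<String> rightKeys = new ArrayList<>();
        List<Conjunct> rest = split(
                Arrays.asList(new Conjunct("1", "=", "1")), "t1", "t2", leftKeys, rightKeys);
        // For "on 1 = 1" no key pair is produced and the predicate is kept as residual.
        System.out.println(leftKeys.size() + " " + rightKeys.size() + " " + rest.size());
    }
}
```

Because keys are only ever added in pairs, the two lists stay the same size by construction, and a condition such as `1 = 1` ends up in the residual condition rather than in either key list.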



  was:
The following SQL, which runs successfully in Hive, fails in Flink with 
"java.lang.IndexOutOfBoundsException: Index: 0, Size: 0":
{code:sql}
select count(1)
from
  (select key
  from t1
  where key = 0) t1
left semi join
  (select key
  from t2
  where key = 0) t2
on 1 = 1;
{code}
From the source code, Flink calls `RelOptUtil.splitJoinCondition` to split the 
join condition and populate the `leftJoinKeys` and `rightJoinKeys` lists passed 
to the method:
{code:java}
  public static RexNode splitJoinCondition(
      List<RelDataTypeField> sysFieldList,
      RelNode leftRel,
      RelNode rightRel,
      RexNode condition,
      List<RexNode> leftJoinKeys,
      List<RexNode> rightJoinKeys,
      List<Integer> filterNulls,
      List<SqlOperator> rangeOp)
{code}

However, when no equality join keys are found, the behavior is weird: 
`leftJoinKeys` contains two RexNodes while `rightJoinKeys` is empty. This causes 
the exception, because we expect `leftJoinKeys` and `rightJoinKeys` to have the 
same size.
This appears to be a Calcite issue: 
[CALCITE-5032|https://issues.apache.org/jira/browse/CALCITE-5032].

To fix it, we can follow what Hive has done and rewrite the method 
`RelOptUtil.splitJoinCondition`. In addition, this makes it possible to support 
the non-equi LEFT SEMI JOIN that Hive supports.




> Support non equality condition for left semi join in Hive dialect
> -----------------------------------------------------------------
>
>                 Key: FLINK-26505
>                 URL: https://issues.apache.org/jira/browse/FLINK-26505
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Hive
>            Reporter: luoyuxia
>            Priority: Major
>             Fix For: 1.15.0
>
>
> The following SQL, which runs successfully in Hive, fails in Flink with 
> "java.lang.IndexOutOfBoundsException: Index: 0, Size: 0":
> {code:sql}
> select count(1)
> from
>   (select key
>   from t1
>   where key = 0) t1
> left semi join
>   (select key
>   from t2
>   where key = 0) t2
> on 1 = 1;
> {code}
> From the source code, Flink calls `RelOptUtil.splitJoinCondition` to split the 
> join condition and populate the `leftJoinKeys` and `rightJoinKeys` lists 
> passed to the method:
> {code:java}
>   public static RexNode splitJoinCondition(
>       List<RelDataTypeField> sysFieldList,
>       RelNode leftRel,
>       RelNode rightRel,
>       RexNode condition,
>       List<RexNode> leftJoinKeys,
>       List<RexNode> rightJoinKeys,
>       List<Integer> filterNulls,
>       List<SqlOperator> rangeOp)
> {code}
> However, when no equality join keys are found, the behavior is weird: 
> `leftJoinKeys` contains two RexNodes while `rightJoinKeys` is empty. This 
> causes the exception, because we expect `leftJoinKeys` and `rightJoinKeys` to 
> have the same size.
> This appears to be a Calcite issue: 
> [CALCITE-5032|https://issues.apache.org/jira/browse/CALCITE-5032].
> To fix it, we can follow what Hive has done and rewrite the method 
> `RelOptUtil.splitJoinCondition`. In addition, this makes it possible to 
> support the non-equi LEFT SEMI JOIN that Hive supports 
> [HIVE-17766|https://issues.apache.org/jira/browse/HIVE-17766].



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
