Zoltan Haindrich created HIVE-26135:
---------------------------------------

             Summary: Invalid Anti join conversion may cause missing results
                 Key: HIVE-26135
                 URL: https://issues.apache.org/jira/browse/HIVE-26135
             Project: Hive
          Issue Type: Bug
            Reporter: Zoltan Haindrich
            Assignee: Zoltan Haindrich


right now I think the following is needed to trigger the issue:
* left outer join
* only select left hand side columns
* conditional which is using some udf
* the nullness of the udf is checked

repro sql; in case the conversion happens the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a 
is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;

{code}



workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to