[ https://issues.apache.org/jira/browse/HIVE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chetna Chaudhari updated HIVE-11735:
------------------------------------
    Description:
Hive if() udf returns wrong results when string equality is used as the condition and the compared literals differ only in case.

Observation:
1) if( name = 'chetna' , 3, 4) and if( name = 'Chetna', 3, 4) are both treated as equal.
2) The result of the rightmost udf is pushed to the predicates on the left side, leading to the same result for both udfs.

How to reproduce the issue:
1) CREATE TABLE `sample`(
  `name` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
  'transient_lastDdlTime'='1425075745');
2) insert into table sample values ('chetna');
3) select min(if(name = 'chetna', 4, 3)) , min(if(name='Chetna', 4, 3)) from sample;
This will give result:
3 3
Expected result:
4 3
4) select min(if(name = 'Chetna', 4, 3)) , min(if(name='chetna', 4, 3)) from sample;
This will give result:
4 4
Expected result:
3 4

  was:
Hive if() udf is returning different results when string equality is used as condition, with case change.

Observation:
1) if( name = 'chetna' , 3, 4) and if( name = 'Chetna', 3, 4) both are treated as equal.
2) The rightmost udf result is pushed to predicates on left side. Leading to same result for both the udfs.

How to reproduce the issue:
1) CREATE TABLE `sample`(
  `name` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
  'transient_lastDdlTime'='1425075745');
2) insert into table sample values ('chetna');
3) select min(if(name = 'chetna', 4, 3)) , min(if(name='Chetna', 4, 3)) from sample;
This will give result :
3 3
Expected result:
4 3
4) select min(if(name = 'Chetna', 4, 3)) , min(if(name='chetna', 4, 3)) from sample;
This will give result
4 4
Expected result:
3 4

> Different results when multiple if() functions are used
> --------------------------------------------------------
>
>                 Key: HIVE-11735
>                 URL: https://issues.apache.org/jira/browse/HIVE-11735
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Chetna Chaudhari
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
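
For reference, the expected results in steps 3) and 4) can be re-derived outside Hive. The following is only a minimal Python sketch of the intended semantics, assuming string equality is case-sensitive and each if() is evaluated independently; the helper names hive_if and expected are hypothetical and are not part of Hive.

```python
# Reference model of the expected semantics only; this does not call Hive.
# Assumption: string equality is case-sensitive and each if() is evaluated
# on its own, with no sharing of results between the two expressions.

def hive_if(condition, when_true, when_false):
    """Hypothetical stand-in for if(cond, a, b)."""
    return when_true if condition else when_false


# Contents of the `sample` table after step 2).
rows = ["chetna"]


def expected(rows, first_literal, second_literal):
    """Model: select min(if(name = first, 4, 3)), min(if(name = second, 4, 3))."""
    first = min(hive_if(name == first_literal, 4, 3) for name in rows)
    second = min(hive_if(name == second_literal, 4, 3) for name in rows)
    return first, second


print(expected(rows, "chetna", "Chetna"))  # step 3): (4, 3)
print(expected(rows, "Chetna", "chetna"))  # step 4): (3, 4)
```

Swapping the two literals swaps the two output columns in this model, which is the behaviour the report expects from Hive.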