[ https://issues.apache.org/jira/browse/HIVE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14962009#comment-14962009 ]
Ashutosh Chauhan commented on HIVE-11735:
-----------------------------------------

I think the problem here stems from
{code}
aggregations.put(expressionTree.toStringTree().toLowerCase(), expressionTree);
{code}
For your particular query, I think removing {{toLowerCase()}} would solve the problem. Do you really need the other changes for column aliases and such in RR?

The intent of this map is to detect duplicate functions in aggregations, so that we do not compute them twice. However, it blindly calls {{toLowerCase()}} on the full expression tree, ignoring the fact that there might be constant literals in it.

There are two possible solutions here:
* Eliminate this logic altogether from this phase. Don't bother about duplicates in phase 1 analysis. Instead, write a rule on either the Calcite operator tree or the Hive operator tree that walks the expressions, detects duplicates, and fixes up the operator tree to refer to a single expression tree.
* Write a utility function that takes an expression tree as an argument and returns the lower-case version of its string tree, while leaving constant string literals in their original case. Then use this string representation as the key in that map.

IMHO, Option 1 is the cleaner approach. However, it might be a big change touching various pieces of planning. Option 2 is a much more local and contained change, but kind of inelegant.

cc: [~jpullokkaran] in case he has other ideas.

> Different results when multiple if() functions are used
> --------------------------------------------------------
>
>                 Key: HIVE-11735
>                 URL: https://issues.apache.org/jira/browse/HIVE-11735
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0, 1.0.0, 1.1.1, 1.2.1
>            Reporter: Chetna Chaudhari
>            Assignee: Chetna Chaudhari
>         Attachments: HIVE-11735.patch
>
>
> The Hive if() UDF returns different results when string equality is used as the condition and the literals differ only in case.
> Observation:
> 1) if( name = 'chetna' , 3, 4) and if( name = 'Chetna', 3, 4) are both treated as equal.
> 2) The rightmost UDF's result is pushed to the predicates on the left side, leading to the same result for both UDFs.
> How to reproduce the issue:
> 1) CREATE TABLE `sample`(
>   `name` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1425075745');
> 2) insert into table sample values ('chetna');
> 3) select min(if(name = 'chetna', 4, 3)) , min(if(name='Chetna', 4, 3)) from sample;
> This will give result:
> 3 3
> Expected result:
> 4 3
> 4) select min(if(name = 'Chetna', 4, 3)) , min(if(name='chetna', 4, 3)) from sample;
> This will give result:
> 4 4
> Expected result:
> 3 4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
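
For reference, Option 2 from the comment above could look roughly like the sketch below. It is not taken from the attached patch or from Hive itself. The class name {{AggregationKeys}} and the method {{lowerCaseOutsideLiterals}} are hypothetical, and the sketch works on the string produced by {{toStringTree()}} rather than on the tree, on the assumption that string literals appear in that string wrapped in single or double quotes with backslash escapes.

```java
// Hypothetical helper, not part of Hive: builds a case-insensitive key for the
// aggregations map while keeping quoted string literals in their original case.
final class AggregationKeys {
    private AggregationKeys() {}

    /** Lower-cases s, except for text inside single- or double-quoted literals. */
    static String lowerCaseOutsideLiterals(String s) {
        StringBuilder out = new StringBuilder(s.length());
        char quote = 0;          // 0 means "not inside a literal"
        boolean escaped = false; // true right after a backslash inside a literal
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (quote != 0) {
                // Inside a literal: copy the character unchanged.
                out.append(c);
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == quote) {
                    quote = 0; // closing quote ends the literal
                }
            } else {
                if (c == '\'' || c == '"') {
                    quote = c; // opening quote starts a literal
                }
                out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }
}
```

With something like this, the key in the map would be {{AggregationKeys.lowerCaseOutsideLiterals(expressionTree.toStringTree())}} instead of {{expressionTree.toStringTree().toLowerCase()}}, so {{min(if(name = 'chetna', 4, 3))}} and {{min(if(name = 'Chetna', 4, 3))}} would get distinct keys, while {{MIN(...)}} and {{min(...)}} over the same arguments would still be detected as duplicates.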