Harper Koo created HIVE-13375: --------------------------------- Summary: Cannot get the right count(*) through 'group by' Key: HIVE-13375 URL: https://issues.apache.org/jira/browse/HIVE-13375 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 1.1.0 Environment: hive 1.1 under cloudera 5.4.0 Reporter: Harper Koo
I have a query looks like: select b.rd ,b.type ,null as pv ,count(*) as uv from( select rd ,type ,user from (select rd ,case when url like '%abc' then 'A' when url like '%abc1' then 'B' when url like '%abc2' then 'C' end as type ,nvl(id,guid) as user from tableA pq where rd='2016-03-08' and url like '%abc%' )t group by rd,type,user order by rd )b group by b.rd,b.type; I want to get the unique 'user' from tableA without using count(distinct()), but the result surprised me! the result looks like: 2016-03-08 A NULL 245 2016-03-08 B NULL 53 2016-03-08 C NULL 6 2016-03-08 A NULL 543 2016-03-08 B NULL 456 2016-03-08 C NULL 25 2016-03-08 A NULL 544 2016-03-08 B NULL 65 2016-03-08 C NULL 34 …… It is quite long and does not group by rd and type at all. I run the same query in impala and it works well there and gives me the right answer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)