wangmeng created HIVE-10971:
-------------------------------

             Summary: count(*) with count(distinct) gives wrong results when hive.groupby.skewindata=true
                 Key: HIVE-10971
                 URL: https://issues.apache.org/jira/browse/HIVE-10971
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 1.2.0
            Reporter: wangmeng
            Assignee: wangmeng

When hive.groupby.skewindata=true, the following query based on TPC-H gives wrong results:

{code}
set hive.groupby.skewindata=true;
select l_returnflag, count(*), count(distinct l_linestatus)
from lineitem
group by l_returnflag
limit 10;
{code}

The query plan shows that Hive generates only one MapReduce job instead of the two that hive.groupby.skewindata=true requires. The problem arises only when {noformat}count(*){noformat} and {noformat}count(distinct){noformat} appear together.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)