Xinyu Wang created HIVE-24153:
---------------------------------

             Summary: distinct is not quite effective in table expression
                 Key: HIVE-24153
                 URL: https://issues.apache.org/jira/browse/HIVE-24153
             Project: Hive
          Issue Type: Bug
          Components: Query Planning
    Affects Versions: 3.1.1
            Reporter: Xinyu Wang

Below is an example, given a table _t(id int, name string, comment string)_:

_with cte as (_
_select distinct id, name, comment_
_from t_
_)_
_select count(*) from cte_

The count returned by the above query is larger than the result of _select count(distinct id, name, comment) from t_. In the EXPLAIN output for the CTE query, PARTITION_ONLY_SHUFFLE is used, whereas for _select count(distinct id, name, comment)_, SHUFFLE is used instead.

Thanks.
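A minimal reproduction sketch of the two queries being compared may help when triaging. The sample rows are hypothetical (the report includes no data), and a mismatch would presumably only show up on a table large enough to be processed by several tasks, so this only illustrates the shape of the comparison:

```sql
-- Table definition as described in the report.
CREATE TABLE t (id INT, name STRING, comment STRING);

-- Hypothetical sample data: the second row exactly duplicates the first.
INSERT INTO t VALUES
  (1, 'a', 'x'),
  (1, 'a', 'x'),
  (2, 'b', 'y');

-- Query 1: DISTINCT inside a CTE, then COUNT(*) over the result.
WITH cte AS (
  SELECT DISTINCT id, name, comment
  FROM t
)
SELECT COUNT(*) FROM cte;

-- Query 2: multi-column COUNT(DISTINCT ...) directly on the table.
SELECT COUNT(DISTINCT id, name, comment) FROM t;

-- Compare the plans; the report observes PARTITION_ONLY_SHUFFLE
-- for the first query and SHUFFLE for the second.
EXPLAIN
WITH cte AS (
  SELECT DISTINCT id, name, comment
  FROM t
)
SELECT COUNT(*) FROM cte;

EXPLAIN
SELECT COUNT(DISTINCT id, name, comment) FROM t;
```

The sample rows deliberately contain no NULLs. Hive's COUNT(DISTINCT expr, expr, ...) counts only rows in which every listed expression is non-NULL, while SELECT DISTINCT keeps such rows, so NULL values alone could make the first count larger and should be ruled out before attributing the difference to the shuffle type.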