[ https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vineet Garg updated HIVE-14442:
-------------------------------
    Attachment: HIVE-14442.3.patch

> CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14442
>                 URL: https://issues.apache.org/jira/browse/HIVE-14442
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-14442.1.patch, HIVE-14442.2.patch, HIVE-14442.3.patch
>
>
> Reproducer
> {code}
> set hive.cbo.returnpath.hiveop=true;
> set hive.map.aggr=false;
> create table abcd (a int, b int, c int, d int);
> LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
> {code}
> {code} explain select count(distinct a) from abcd group by b; {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: abcd
>             Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: a (type: int)
>               outputColumnNames: a
>               Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: a (type: int), a (type: int)
>                 sort order: ++
>                 Map-reduce partition columns: a (type: int)
>                 Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations: count(DISTINCT KEY._col1:0._col0)
>           keys: KEY._col0 (type: int)
>           mode: complete
>           outputColumnNames: b, $f1
>           Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: $f1 (type: bigint)
>             outputColumnNames: _o__c0
>             Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>               table:
>                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> {code}
> {code} explain select count(distinct a) from abcd group by c; {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: abcd
>             Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: a (type: int)
>               outputColumnNames: a
>               Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: a (type: int), a (type: int)
>                 sort order: ++
>                 Map-reduce partition columns: a (type: int)
>                 Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column stats: NONE
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations: count(DISTINCT KEY._col1:0._col0)
>           keys: KEY._col0 (type: int)
>           mode: complete
>           outputColumnNames: c, $f1
>           Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: $f1 (type: bigint)
>             outputColumnNames: _o__c0
>             Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column stats: NONE
>               table:
>                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> {code}
> Both cases above have the wrong keys in the map-side Reduce Output Operator: both use a, a instead of b, a and c, a respectively.
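
To illustrate why the wrong map-side keys produce a wrong result, here is a toy Python model of the reduce side. It is only a sketch and not Hive code: it assumes, as the plans above suggest, that the reducer groups on the first emitted key (KEY._col0) and counts distinct values of the second (KEY._col1). The function name and the sample rows are hypothetical; the rows are not the contents of in4.txt.

```python
from collections import defaultdict


def reduce_side_count_distinct(rows, key_columns):
    """Toy model of the reduce-side Group By Operator.

    key_columns names the two columns the map side emits as keys.
    The first plays the role of KEY._col0 (the grouping key) and the
    second the role of KEY._col1 (the DISTINCT argument).
    """
    group_column, distinct_column = key_columns
    groups = defaultdict(set)
    for row in rows:
        groups[row[group_column]].add(row[distinct_column])
    # count(DISTINCT ...) per group
    return {key: len(values) for key, values in sorted(groups.items())}


# Hypothetical sample rows, made up for illustration only.
rows = [
    {"a": 1, "b": 10},
    {"a": 2, "b": 10},
    {"a": 2, "b": 20},
    {"a": 3, "b": 20},
    {"a": 3, "b": 20},
]

# Keys as emitted in the reported plan: a, a
wrong = reduce_side_count_distinct(rows, ("a", "a"))
# Keys the query "group by b" needs: b, a
expected = reduce_side_count_distinct(rows, ("b", "a"))

print("keys (a, a):", wrong)
print("keys (b, a):", expected)
```

With keys a, a the model produces one group per distinct value of a, each with a count of 1, while keys b, a produce one group per value of b, which is what the query asks for.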