[ https://issues.apache.org/jira/browse/HIVE-13737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15302294#comment-15302294 ]
frank luo commented on HIVE-13737:
----------------------------------

hive> explain INSERT INTO TABLE test SELECT * from src UNION ALL SELECT * from src;
OK
Plan not optimized by CBO.

Vertex dependency in root stage
Map 1 <- Union 2 (CONTAINS)
Map 3 <- Union 2 (CONTAINS)

Stage-4
   Stats-Aggr Operator
      Stage-0
         Move Operator
            table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
            Stage-2
               Dependency Collection{}
                  Stage-1
                     Union 2
                     |<-Map 1 [CONTAINS]
                     |  File Output Operator [FS_6]
                     |     compressed:false
                     |     Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                     |     table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
                     |     Select Operator [SEL_1]
                     |        outputColumnNames:["_col0"]
                     |        Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                     |        TableScan [TS_0]
                     |           alias:src
                     |           Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                     |<-Map 3 [CONTAINS]
                        File Output Operator [FS_6]
                           compressed:false
                           Statistics:Num rows: 2 Data size: 2 Basic stats: COMPLETE Column stats: NONE
                           table:{"serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe","name:":"jluo.test","input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"}
                           Select Operator [SEL_3]
                              outputColumnNames:["_col0"]
                              Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
                              TableScan [TS_2]
                                 alias:src
                                 Statistics:Num rows: 1 Data size: 1 Basic stats: COMPLETE Column stats: NONE
Stage-3
   Stats-Aggr Operator
      Please refer to the previous Stage-0

Time taken: 0.088 seconds, Fetched: 41 row(s)

hive> explain SELECT count(*) FROM test;
OK
Plan not optimized by CBO.

Stage-0
   Fetch Operator
      limit:1

Time taken: 0.037 seconds, Fetched: 6 row(s)


> incorrect count when multiple inserts with union all
> ----------------------------------------------------
>
>                 Key: HIVE-13737
>                 URL: https://issues.apache.org/jira/browse/HIVE-13737
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>         Environment: hdp 2.3.4.7 on Red Hat 6
>            Reporter: Frank Luo
>            Priority: Critical
>
> Here is a test case to illustrate the issue. It seems MR works fine but Tez
> is having the problem.
> CREATE TABLE test(col1 STRING);
> CREATE TABLE src (col1 string);
> insert into table src values ('a');
> INSERT into TABLE test
> select * from (
> SELECT * from src
> UNION ALL
> SELECT * from src) x;
> -- do it one more time
> INSERT INTO TABLE test
> SELECT * from src
> UNION ALL
> SELECT * from src;
> --below gives correct result
> SELECT * FROM TEST;
> --count is incorrect. It might give either '1' or '2', but I am expecting '4'
> SELECT count (*) FROM test;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)