[ https://issues.apache.org/jira/browse/HIVE-23833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chuanjie.duan updated HIVE-23833: --------------------------------- Component/s: Hive Affects Version/s: 2.1.1 Description: Reproduce: # Create three tables, mytest_t1, mytest_t2, mytest_t4 # hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = t2.material_code;" # hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = t2.material_code {color:#FF0000}join dw_dev.mytest_t5 t5 on t5.material_code = coalesce(t1.material_code,t2.material_code){color};" 2 - explain Map Reduce Map Operator Tree: TableScan alias: t1 Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: material_code (type: string), wh_guid (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string) TableScan alias: t2 Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: material_code (type: string), wh_guid (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string) Reduce Operator Tree: Join Operator condition map: {color:#FF0000}Outer Join 0 to 1{color} keys: 0 _col0 (type: string) 1 _col0 (type: string) outputColumnNames: _col1, _col3 Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: COALESCE(_col1,_col3) (type: string) outputColumnNames: _col0 Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe 3 - explain STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias -> Map Local Tables: $hdt$_1:t2 Fetch Operator limit: -1 $hdt$_2:t5 Fetch Operator limit: -1 Alias -> Map Local Operator Tree: $hdt$_1:t2 TableScan alias: t2 Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: material_code is not null (type: boolean) Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: material_code (type: string), wh_guid (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col0 (type: string) 1 _col0 (type: string) $hdt$_2:t5 TableScan alias: t5 Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: material_code is not null (type: boolean) Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: material_code (type: string) outputColumnNames: _col0 Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 COALESCE(_col0,_col2) (type: string) 1 _col0 (type: string) Stage: Stage-5 Map Reduce Map Operator Tree: TableScan alias: t1 Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: material_code is not null (type: boolean) Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: material_code (type: string), wh_guid (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: {color:red} Inner Join 0 to 1{color} keys: 0 _col0 (type: string) 1 _col0 (type: string) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: {color:red} Inner Join 0 to 1{color} keys: 0 COALESCE(_col0,_col2) (type: string) 1 _col0 (type: string) outputColumnNames: _col1, _col3 Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: COALESCE(_col1,_col3) (type: string) outputColumnNames: _col0 Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Summary: wrong explain and result when full join with join (was: hive) > wrong explain and result when full join with join > ------------------------------------------------- > > Key: HIVE-23833 > URL: https://issues.apache.org/jira/browse/HIVE-23833 > Project: Hive > Issue Type: Bug > Components: Hive > Affects Versions: 2.1.1 > Reporter: chuanjie.duan > Priority: Major > > Reproduce: > # Create three tables, mytest_t1, mytest_t2, mytest_t4 > # hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from > dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = > t2.material_code;" > # hive -e "explain select coalesce(t1.wh_guid,t2.wh_guid) as wh_guid from > dw_dev.mytest_t1 t1 full join dw_dev.mytest_t2 t2 on t1.material_code = > t2.material_code {color:#FF0000}join dw_dev.mytest_t5 t5 on t5.material_code > = coalesce(t1.material_code,t2.material_code){color};" > 2 - explain > Map Reduce > Map Operator Tree: > TableScan > alias: t1 > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: material_code (type: string), wh_guid (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > Reduce Output Operator > key expressions: _col0 (type: string) > sort order: + > Map-reduce partition columns: _col0 (type: string) > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > value expressions: _col1 (type: string) > TableScan > alias: t2 > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: material_code (type: string), wh_guid (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > Reduce Output Operator > key expressions: _col0 (type: string) > sort order: + > Map-reduce partition columns: _col0 (type: string) > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > value expressions: _col1 (type: string) > Reduce Operator Tree: > Join Operator > condition map: > {color:#FF0000}Outer Join 0 to 1{color} > keys: > 0 _col0 (type: string) > 1 _col0 (type: string) > outputColumnNames: _col1, _col3 > Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: COALESCE(_col1,_col3) (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column > stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column > stats: NONE > table: > input format: org.apache.hadoop.mapred.SequenceFileInputFormat > output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > 3 - explain > STAGE PLANS: > Stage: Stage-7 > Map Reduce Local Work > Alias -> Map Local Tables: > $hdt$_1:t2 > Fetch Operator > limit: -1 > $hdt$_2:t5 > Fetch Operator > limit: -1 > Alias -> Map Local Operator Tree: > $hdt$_1:t2 > TableScan > alias: t2 > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: material_code is not null (type: boolean) > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: material_code (type: string), wh_guid (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 1201 Data size: 259416 Basic stats: COMPLETE Column > stats: NONE > HashTable Sink Operator > keys: > 0 _col0 (type: string) > 1 _col0 (type: string) > $hdt$_2:t5 > TableScan > alias: t5 > Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: material_code is not null (type: boolean) > Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: material_code (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 12927 Data size: 2430276 Basic stats: COMPLETE Column > stats: NONE > HashTable Sink Operator > keys: > 0 COALESCE(_col0,_col2) (type: string) > 1 _col0 (type: string) > Stage: Stage-5 > Map Reduce > Map Operator Tree: > TableScan > alias: t1 > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > Filter Operator > predicate: material_code is not null (type: boolean) > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: material_code (type: string), wh_guid (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6159 Data size: 1724520 Basic stats: COMPLETE Column > stats: NONE > Map Join Operator > condition map: > {color:red} Inner Join 0 to 1{color} > keys: > 0 _col0 (type: string) > 1 _col0 (type: string) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 6774 Data size: 1896972 Basic stats: COMPLETE Column > stats: NONE > Map Join Operator > condition map: > {color:red} Inner Join 0 to 1{color} > keys: > 0 COALESCE(_col0,_col2) (type: string) > 1 _col0 (type: string) > outputColumnNames: _col1, _col3 > Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column > stats: NONE > Select Operator > expressions: COALESCE(_col1,_col3) (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column > stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 14219 Data size: 2673303 Basic stats: COMPLETE Column > stats: NONE > table: > input format: org.apache.hadoop.mapred.SequenceFileInputFormat > output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe -- This message was sent by Atlassian Jira (v8.3.4#803005)