Li Xin created HIVE-9613: ---------------------------- Summary: Left join query plan outputs wrong column when using subquery Key: HIVE-9613 URL: https://issues.apache.org/jira/browse/HIVE-9613 Project: Hive Issue Type: Bug Components: Parser, Query Planning Affects Versions: 0.14.0, 1.0.0 Environment: apache hadoop 2.5.1 Reporter: Li Xin
I have a query that outputs a column with wrong contents,which equals to another column. I have three tables,as follows: table 1: _hivetemp.category_city_rank_: ||category||city||rank|| |jinrongfuwu|shanghai|1| |ktvjiuba|shanghai|2| table 2:_hivetemp.category_match_: ||src_category_en||src_category_cn||dst_category_en||dst_category_cn|| |danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu| |zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba| table 3:_hivetemp.city_match_: ||src_city_name_en||dst_city_name_en||city_name_cn|| |sh|shanghai|上海| And the query is : {code} select a.category, a.city, a.rank, b.src_category_en, c.src_city_name_en from hivetemp.category_city_rank a left outer join (select src_category_en, dst_category_en from hivetemp.category_match) b on a.category = b.dst_category_en left outer join (select src_city_name_en, dst_city_name_en from hivetemp.city_match) c on a.city = c.dst_city_name_en {code} which shoud output the results as follows,and i test it in hive 0.13: ||category||city||rank||src_category_en||src_city_name_en|| |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh| |ktvjiuba|shanghai|2|zpwentiyingshi|sh| but int hive0.14,the results in the column *src_category_en* is wrong,and is just the *city* contents: ||category||city||rank||src_category_en||src_city_name_en|| |jinrongfuwu|shanghai|1|shanghai|sh| |ktvjiuba|shanghai|2|shanghai|sh| Using explain to examine the execution plan,i can see the first subquery just outputs one column of *dst_category_en*,and *src_category_en* is just missing. {quote} b:category_match TableScan alias: category_match Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: dst_category_en (type: string) outputColumnNames: _col1 Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE Column stats: NONE {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)