[ https://issues.apache.org/jira/browse/HIVE-15272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15691127#comment-15691127 ]
Xuefu Zhang commented on HIVE-15272:
------------------------------------

Just curious: what's the difference? Is it just a matter of ordering? Hive doesn't guarantee ordering without an ORDER BY clause. MR may do that, but that's just a matter of implementation. If there is a data issue, a complete repro case would be more helpful.

> "LEFT OUTER JOIN" Is not populating different records with Hive On Spark
> ------------------------------------------------------------------------
>
>                 Key: HIVE-15272
>                 URL: https://issues.apache.org/jira/browse/HIVE-15272
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Spark
>    Affects Versions: 1.1.0
>         Environment: Hive 1.1.0, CentOS, Cloudera 5.7.4
>            Reporter: Vikash Pareek
>
> The following query returns a different result every time I run it with Hive on Spark:
> {code}
> SELECT COUNT(*)
> FROM
>   (SELECT DISTINCT mt1.name,
>                    mt1.id
>    FROM
>      (SELECT mt1.*,
>              mt2.region,
>              mt2.,
>              regexp_replace(mt2.tr_dat,"\\.","") AS TRANSACTION_DATE
>       FROM my_database.my_table1 mt1
>       LEFT OUTER JOIN my_database.my_table2 mt2 ON (mt1.id=mt2.id
>                                                     AND mt1.name = mt2.name))t6)A;
> {code}
> But the same query returns the same result every time with Hive on MapReduce.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)