[ https://issues.apache.org/jira/browse/HIVE-8422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167802#comment-14167802 ]
Chao commented on HIVE-8422:
----------------------------

I tested all qfiles matching the pattern: *join*.q

1. There are 4 files that failed in execution:
{noformat}
parquet_join
smb_mapjoin_11
smb_mapjoin_12
tez_join_hash
{noformat}
I will look at the logs and perhaps create follow-up JIRAs for those.

2. There are some files whose output order is not the same as the MR output:
{noformat}
auto_join26
bucketmapjoin7
date_join1
join40
skewjoinopt2
vector_decimal_mapjoin
vector_mapjoin_reduce
auto_join_without_localtask
{noformat}
I will also create follow-up JIRAs for this issue.

3. For a few tests, the results are not the same. I don't know exactly what the reason is.
{noformat}
mapjoin1
vectorized_nested_mapjoin
{noformat}

4. For {{infer_bucket_sort_convert_join}}, Spark's output file has 0 for both {{numRows}} and {{rawDataSize}}. We may need to investigate this.

5. For a lot of tests, the MR results contain a warning like the following:
{noformat}
Warning: Map Join MAPJOIN[13][bigTable=b] in task 'Stage-2:MAPRED' is a cross product
{noformat}
for a potentially expensive join operator. Perhaps we should do something similar in the Spark branch?

> Turn on all join .q tests [Spark Branch]
> ----------------------------------------
>
>                 Key: HIVE-8422
>                 URL: https://issues.apache.org/jira/browse/HIVE-8422
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chao
>
> With HIVE-8412, all join queries should work on Spark, whether they require a
> particular optimization or not.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)