[ 
https://issues.apache.org/jira/browse/HIVE-8422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167802#comment-14167802
 ] 

Chao commented on HIVE-8422:
----------------------------

I tested all qfiles with the pattern: *join*.q
1. Four files failed during execution:
{noformat}
parquet_join
smb_mapjoin_11
smb_mapjoin_12
tez_join_hash
{noformat}
I will look at the log and perhaps create follow-up JIRAs for those.

2. For some files, the output order differs from the MR output:
{noformat}
auto_join26
bucketmapjoin7
date_join1
join40
skewjoinopt2
vector_decimal_mapjoin
vector_mapjoin_reduce
auto_join_without_localtask
{noformat}
I will also create follow-up JIRAs for this issue.
3. For a few tests, the results themselves differ. I don't know the exact 
reason yet.
{noformat}
mapjoin1
vectorized_nested_mapjoin
{noformat}
4. For {{infer_bucket_sort_convert_join}}, Spark's output file shows 0 for both 
{{numRows}} and {{rawDataSize}}. We may need to investigate this.
5. For many tests, the MR results contain a warning like the following:
{noformat}
Warning: Map Join MAPJOIN[13][bigTable=b] in task 'Stage-2:MAPRED' is a cross product
{noformat}
This warning flags a potentially expensive join operator. Perhaps we should do 
something similar in the Spark branch?
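
To separate the order-only differences in item 2 from the real mismatches in 
item 3, a comparison that ignores row order can help. The following is a 
minimal Python sketch, not part of the Hive test framework: the function name 
{{classify_outputs}} and the sample rows are hypothetical, and it assumes each 
result row is one line of text.

```python
from collections import Counter


def classify_outputs(mr_lines, spark_lines):
    """Classify how two lists of result rows differ.

    Returns "identical" if the rows match in order, "order_only" if they
    contain the same rows (with the same multiplicity) in a different
    order, and "different" otherwise.
    """
    if mr_lines == spark_lines:
        return "identical"
    # Counter compares rows as a multiset, so duplicates are respected.
    if Counter(mr_lines) == Counter(spark_lines):
        return "order_only"
    return "different"


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    mr = ["1\ta", "2\tb", "3\tc"]
    spark_reordered = ["3\tc", "1\ta", "2\tb"]
    spark_mismatch = ["1\ta", "2\tb", "4\td"]
    print(classify_outputs(mr, spark_reordered))
    print(classify_outputs(mr, spark_mismatch))
```

A test reported as {{order_only}} would belong with item 2, while 
{{different}} would point to a real result mismatch as in item 3.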


> Turn on all join .q tests [Spark Branch]
> ----------------------------------------
>
>                 Key: HIVE-8422
>                 URL: https://issues.apache.org/jira/browse/HIVE-8422
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chao
>
> With HIVE-8412, all join queries should work on Spark, whether they require a 
> particular optimization or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
