[ https://issues.apache.org/jira/browse/HIVE-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14258583#comment-14258583 ]
Hive QA commented on HIVE-9007:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12689071/HIVE-9007-spark.patch

{color:red}ERROR:{color} -1 due to 152 failed/errored test(s), 7257 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_cast_constant org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_authorization_admin_almighty1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join24 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join27 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join30 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join31 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join32 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_without_localtask org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_15 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_column_access_stats org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_product_check_1 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_product_check_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_complex_types org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_complex_types_multi_single_reducer org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_position org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_1_23 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_skew_1_23 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_having org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join18_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join28 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join29 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join30 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join31 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join35 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join40 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_merge_multi_expressions org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part13 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part14 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_subquery org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_lateral_view org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_mixed org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_join_union org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel_join0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join_filter org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_transform org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_script_pipe org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_semijoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_noskew org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_union_remove_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoin_union_remove_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt10 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt11 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt15 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt19 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt20 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt7 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_skewjoinopt9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_temp_table org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_tez_join_tests org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_tez_joins_explain org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union25 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union33 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_10 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_15 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_18 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_19 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_20 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_24 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_25 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_6 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_7 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_cast_constant org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_left_outer_join org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_orderby_5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_0 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_mapjoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_nested_mapjoin 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_shufflejoin org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/588/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/588/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-588/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 152 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12689071 - PreCommit-HIVE-SPARK-Build

> Hive may generate wrong plan for map join queries due to
> IdentityProjectRemover [Spark Branch]
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9007
>                 URL: https://issues.apache.org/jira/browse/HIVE-9007
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>            Assignee: Szehon Ho
>         Attachments: HIVE-9007-spark.patch
>
>
> HIVE-8435 introduces a new logical optimizer called IdentityProjectRemover,
> which may cause map join in the Spark branch to generate a wrong plan.
> Currently, the map join conversion in the Spark branch first goes through the
> method {{convertJoinMapJoin}}, which replaces a join op with a mapjoin op,
> removes the RS associated with the big table, and keeps the RSs for all small tables.
> Afterwards, {{SparkReduceSinkMapJoinProc}} replaces all parent RSs of
> the mapjoin op with HTS (note that it doesn't check whether an RS belongs to
> a small table or the big table).
> The issue arises when IdentityProjectRemover comes into play, which may
> result in an operator tree that has two consecutive RSs. Imagine
> the following example:
> {noformat}
>       Join                MapJoin
>      /    \              /      \
>    RS      RS   --->   RS        RS
>   /          \        /            \
> TS            RS    TS              TS (big table)
>                \  (small table)
>                 TS
> {noformat}
> In this case, all parents of the mapjoin op will be RSs, even the one on the
> branch for the big table! In {{SparkReduceSinkMapJoinProc}}, they will all be
> replaced with HTS, which is incorrect for the big-table branch.
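
The failure mode described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Op` class and the functions `convert_join_to_mapjoin`, `replace_parent_rs_with_hts`, and `build_join` are hypothetical stand-ins written for this illustration, not Hive's actual classes or the code of {{convertJoinMapJoin}} or {{SparkReduceSinkMapJoinProc}}. It models only the two steps as the description states them: one RS is removed on the big-table branch, then every RS parent of the mapjoin op is turned into an HTS without checking which table it belongs to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical operator node: kind is TS, RS, HTS, JOIN or MAPJOIN."""
    kind: str
    parents: List["Op"] = field(default_factory=list)


def convert_join_to_mapjoin(join: Op, big_pos: int) -> Op:
    """Toy stand-in for the first step: replace the join with a mapjoin and
    splice out the single RS directly above it on the big-table branch.
    Small-table RSs are kept."""
    parents = list(join.parents)
    big_rs = parents[big_pos]
    assert big_rs.kind == "RS" and len(big_rs.parents) == 1
    parents[big_pos] = big_rs.parents[0]  # removes exactly one RS
    return Op("MAPJOIN", parents)


def replace_parent_rs_with_hts(mapjoin: Op) -> None:
    """Toy stand-in for the second step: every RS parent becomes an HTS,
    with no check for small table versus big table."""
    mapjoin.parents = [
        Op("HTS", p.parents) if p.kind == "RS" else p
        for p in mapjoin.parents
    ]


def build_join(big_branch_rs_count: int) -> Op:
    """Build JOIN(small: RS<-TS, big: RS<-...<-TS) with the given number of
    RSs stacked on the big-table branch."""
    small = Op("RS", [Op("TS")])
    big = Op("TS")
    for _ in range(big_branch_rs_count):
        big = Op("RS", [big])
    return Op("JOIN", [small, big])


def parent_kinds(op: Op) -> List[str]:
    return [p.kind for p in op.parents]


# Usual shape: one RS on the big-table branch, so it is fully removed.
normal = convert_join_to_mapjoin(build_join(1), big_pos=1)
replace_parent_rs_with_hts(normal)
print(parent_kinds(normal))   # ['HTS', 'TS']

# Two consecutive RSs on the big-table branch: one RS survives the first
# step and is then blindly turned into an HTS.
broken = convert_join_to_mapjoin(build_join(2), big_pos=1)
replace_parent_rs_with_hts(broken)
print(parent_kinds(broken))   # ['HTS', 'HTS']
```

In the second case the big-table branch ends up behind an HTS, which matches the wrong plan the description warns about. The sketch does not model how IdentityProjectRemover produces the consecutive RSs, nor what the attached patch changes.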