[ 
https://issues.apache.org/jira/browse/HIVE-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8207:
-----------------------
    Attachment: HIVE-8207.2-spark.patch

I added {{SORT_QUERY_RESULTS}} to some qfiles, so the corresponding
result files for Hive need to be updated as well.
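
For context, here is a minimal sketch of why sorting helps. As far as I understand it, {{SORT_QUERY_RESULTS}} makes the test driver sort the query output before diffing it against the expected file, so a different row order no longer fails the test. The function and sample rows below are hypothetical stand-ins for illustration, not the actual Hive test code.

```python
def results_match(actual: str, expected: str, sort_results: bool = False) -> bool:
    """Compare two query outputs line by line.

    With sort_results=True, both outputs are sorted first, so the
    comparison ignores row order (a rough model of SORT_QUERY_RESULTS).
    """
    actual_lines = actual.splitlines()
    expected_lines = expected.splitlines()
    if sort_results:
        actual_lines = sorted(actual_lines)
        expected_lines = sorted(expected_lines)
    return actual_lines == expected_lines


# Hypothetical outputs: same rows, different order.
mr_output = "1\ta\n2\tb\n3\tc"
spark_output = "3\tc\n1\ta\n2\tb"

strict = results_match(spark_output, mr_output)
order_insensitive = results_match(spark_output, mr_output, sort_results=True)
print(strict, order_insensitive)
```

Sorting only hides differences in row order; rows whose values differ would still show up in the diff.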

> Add .q tests for multi-table insertion [Spark Branch]
> -----------------------------------------------------
>
>                 Key: HIVE-8207
>                 URL: https://issues.apache.org/jira/browse/HIVE-8207
>             Project: Hive
>          Issue Type: Test
>          Components: Spark
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8207.1-spark.patch, HIVE-8207.2-spark.patch
>
>
> Now that multi-table insertion is committed to the branch, we should enable 
> the related qtests.
> Here is a list of qfiles that should be activated (some of them may already 
> be activated).
> The list may not be comprehensive.
> {noformat}
> add_part_multiple.q
> auto_smb_mapjoin_14.q
> bucket5.q
> column_access_stats.q
> date_udf.q
> groupby10.q
> groupby11.q
> groupby3_map_multi_distinct.q
> groupby3_map.q
> groupby3_map_skew.q
> groupby3_noskew_multi_distinct.q
> groupby3_noskew.q
> groupby7_map_multi_single_reducer.q
> groupby7_map.q
> groupby7_map_skew.q
> groupby7_noskew_multi_single_reducer.q
> groupby7_noskew.q
> groupby7.q
> groupby8_map.q
> groupby8_map_skew.q
> groupby8_noskew.q
> groupby8.q
> groupby9.q
> groupby_complex_types_multi_single_reducer.q
> groupby_complex_types.q
> groupby_cube1.q
> groupby_map_ppr_multi_distinct.q
> groupby_map_ppr.q
> groupby_multi_insert_common_distinct.q
> groupby_multi_single_reducer2.q
> groupby_multi_single_reducer3.q
> groupby_multi_single_reducer.q
> groupby_position.q
> groupby_ppr.q
> groupby_rollup1.q
> groupby_sort_1_23.q
> groupby_sort_1.q
> groupby_sort_skew_1_23.q
> infer_bucket_sort_multi_insert.q
> innerjoin.q
> input12_hadoop20.q
> input12.q
> input13.q
> input14.q
> input17.q
> input18.q
> input1_limit.q
> input_part2.q
> insert_into3.q
> join_nullsafe.q
> load_dyn_part8.q
> metadata_only_queries_with_filters.q
> multigroupby_singlemr.q
> multi_insert_gby2.q
> multi_insert_gby3.q
> multi_insert_gby.q
> multi_insert_lateral_view.q
> multi_insert_move_tasks_share_dependencies.q
> multi_insert.q
> parallel.q
> partition_date2.q
> pcr.q
> ppd_multi_insert.q
> ppd_transform.q
> smb_mapjoin_11.q
> smb_mapjoin_12.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> smb_mapjoin_16.q
> stats4.q
> subquery_multiinsert.q
> table_access_keys_stats.q
> tez_dml.q
> udaf_percentile_approx_20.q
> udaf_percentile_approx_23.q
> union17.q
> union18.q
> union19.q
> {noformat}
>
> There are some tests that cannot be enabled right now, due to various reasons:
> 1. ForwardOperator Issue, including
> {noformat}
> groupby7_noskew_multi_single_reducer.q
> groupby8_map.q
> groupby8_map_skew.q
> groupby8_noskew.q
> groupby8.q
> groupby9.q
> groupby10.q
> groupby_multi_insert_common_distinct.q 
> union17.q
> {noformat}
> *Reason*: currently, if the node to break in the operator tree is a 
> ForwardOperator, we simply do nothing. However, we may have the following 
> case:
> {noformat}
>       ...
>       RS_0
>        |
>       FOR
>        |
>      /   \
>    GBY_1  GBY_2
>     |     |
>    ...   ...
>     |     |
>    RS_1  RS_2
>     |     |
>    ...   ...
>     |     |
>    FS_1  FS_2
> {noformat}
> which may result in:
> {noformat}
>           RW
>          /  \
>        RW    RW
> {noformat}
> and because of the issues in HIVE-7731 and HIVE-8118, both downstream 
> branches will receive duplicated (identical) inputs.
> 2. Stats issue, including:
> {noformat}
> bucket5.q
> infer_bucket_sort_multi_insert.q
> stats4.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> {noformat}
> *Reason*: In these tests, I get a diff error because {{numRows}} and 
> {{rawDataSize}} are -1, but they are expected to be positive values. I 
> don't think this is related to multi-insertion.
> 3. Join/SMB Join Issue, including
> {noformat}
> auto_smb_mapjoin_14.q
> auto_sortmerge_join_13.q
> smb_mapjoin_11.q
> smb_mapjoin_12.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> smb_mapjoin_16.q
> {noformat}
> *Reason*: These tests failed either with an exception or with a diff. I 
> think it's because SMB Join (HIVE-8202) isn't supported right now.
> 4. Result doesn't match, including
> {noformat}
> groupby3_map_skew.q
> groupby_map_ppr_multi_distinct.q
> groupby_complex_types_multi_single_reducer.q
> groupby_map_ppr.q
> partition_date2.q
> udaf_percentile_approx_23.q
> {noformat}
> *Reason*: The results from these tests are different from MR's. For instance, 
> the test for groupby3_map_skew.q failed with this diff:
> {noformat}
> < 130091.0      260.182 256.10355987055016      98.0    0.0     142.92680950752379      143.06995106518903      20428.07288     20469.0109
> ---
> > 130091.0      260.182 256.10355987055016      98.0    0.0     142.9268095075238       143.06995106518906      20428.07288     20469.0109
> {noformat}
> I don't know why this happens, but I don't think these failures are related 
> to multi-insertion.
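
One possible explanation, offered as a guess rather than a diagnosis: the two outputs in the groupby3_map_skew.q diff differ only around the 16th or 17th significant digit. That is the kind of difference floating-point addition produces when the same values are aggregated in a different order, since the addition is not associative. The sketch below only demonstrates that general effect, using made-up values and a hypothetical tolerance helper. It does not reproduce what either execution engine actually does.

```python
import math

# Made-up values; any list of non-trivial doubles can show the effect.
values = [0.1, 0.2, 0.3]

# One grouping: add strictly left to right.
left_to_right = (values[0] + values[1]) + values[2]

# Another grouping: combine the last two values first, as might happen
# when partial aggregates are merged in a different order.
regrouped = values[0] + (values[1] + values[2])


def same_within_tolerance(a: float, b: float, rel_tol: float = 1e-12) -> bool:
    """Hypothetical helper: treat two doubles as equal if they agree
    to within a small relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)


print(left_to_right == regrouped)                        # exact comparison
print(same_within_tolerance(left_to_right, regrouped))   # tolerant comparison
```

If that is what is happening here, an exact text diff of the result files will keep failing even though the results agree to about 15 significant digits.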



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
