[ https://issues.apache.org/jira/browse/HIVE-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao updated HIVE-8233:
-----------------------
    Description: 
Right now, for multi-table insertion, we start from the multiple 
FileSinkOperators and break the plan at their lowest common ancestor (LCA), 
adding temporary FileSinkOperator and TableScanOperators. A special case is 
when the LCA is a ForwardOperator, in which case we don't break the plan, 
since it has already been optimized.
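
To make the LCA step concrete, here is a rough sketch in Python. It is a toy model, not Hive's actual code: the plan (TS, SEL, FIL_1, FIL_2), the parent map, and the helper names are all made up for illustration, and each operator is assumed to have a single parent.

```python
def ancestors(parent, node):
    """Return the chain node, parent(node), ... up to the root."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def lowest_common_ancestor(parent, leaves):
    """Return the nearest proper ancestor shared by all leaves, or None."""
    # Candidates are the proper ancestors of the first leaf, nearest first.
    candidates = ancestors(parent, leaves[0])[1:]
    others = [set(ancestors(parent, leaf)) for leaf in leaves[1:]]
    for op in candidates:
        if all(op in chain for chain in others):
            return op
    return None


def should_split_at(op):
    # Per the description, a ForwardOperator LCA is left intact.
    # Identifying it by the "FOR" name prefix is an assumption of this toy model.
    return not op.startswith("FOR")


# Hypothetical plan: TS -> SEL -> (FIL_1 -> FS_1, FIL_2 -> FS_2)
parent = {
    "SEL": "TS",
    "FIL_1": "SEL",
    "FIL_2": "SEL",
    "FS_1": "FIL_1",
    "FS_2": "FIL_2",
}

lca = lowest_common_ancestor(parent, ["FS_1", "FS_2"])
print(lca, should_split_at(lca))
```

In this toy plan the two file sinks meet at SEL, so that is where the plan would be broken; an LCA named FOR would be skipped.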

However, there is an issue. Consider the following plan:

{noformat}
      ...
       |
      FOR
       |
      RS_0
     /   \
   RS_1  RS_2
    |     |
   ...   ...
    |     |
   FS_1  FS_2
{noformat}

In this case, {{FOR}} is the LCA, so the plan is not broken and remains a 
single plan. However, {{RS_0}} feeds both {{RS_1}} and {{RS_2}}. Because of 
the issues in HIVE-7731 and HIVE-8118, both downstream branches will receive 
the same, duplicated results.
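
As a rough illustration of the shape that causes trouble, the sketch below walks the plan above starting from {{FOR}} and reports every operator with more than one child. This is a toy model in Python, not Hive code: the child map and the function name are assumptions, and the operators elided as "..." in the diagram are not modeled.

```python
def fan_out_below(children, start):
    """Return operators reachable from start that have more than one child."""
    found, stack, seen = [], [start], set()
    while stack:
        op = stack.pop()
        if op in seen:
            continue
        seen.add(op)
        kids = children.get(op, [])
        if len(kids) > 1:
            found.append(op)
        stack.extend(kids)
    return found


# Child map for the plan in the description; the "..." segments are skipped.
children = {
    "FOR": ["RS_0"],
    "RS_0": ["RS_1", "RS_2"],
    "RS_1": ["FS_1"],
    "RS_2": ["FS_2"],
}

branching = fan_out_below(children, "FOR")
print(branching)
```

Here only RS_0 is reported, which is the operator that leads to both RS_1 and RS_2 while the plan stays unsplit.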

  was:
Right now, for multi-table insertion, we start from the multiple 
FileSinkOperators and break the plan at their lowest common ancestor (LCA), 
adding temporary FileSinkOperator and TableScanOperators. A special case is 
when the LCA is a ForwardOperator, in which case we don't break the plan, 
since it has already been optimized.

However, there is an issue. Consider the following plan:

{noformat}
     ...
       |
      FOR
       |
      RS_0
     /   \
   RS_1  RS_2
    |     |
   ...   ...
    |     |
   FS_1  FS_2
{noformat}

In this case, {{FOR}} is the LCA, so the plan is not broken and remains a 
single plan. However, {{RS_0}} feeds both {{RS_1}} and {{RS_2}}. Because of 
the issues in HIVE-7731 and HIVE-8118, both downstream branches will receive 
the same, duplicated results.


> multi-table insertion doesn't work with ForwardOperator [Spark Branch]
> ----------------------------------------------------------------------
>
>                 Key: HIVE-8233
>                 URL: https://issues.apache.org/jira/browse/HIVE-8233
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chao


