[jira] Updated: (HIVE-1997) Map join followed by multi-table insert will generate duplicated result

Carl Steinbach (JIRA) Thu, 24 Feb 2011 19:35:08 -0800

     [ https://issues.apache.org/jira/browse/HIVE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Carl Steinbach updated HIVE-1997:
---------------------------------

    Component/s: Query Processor

> Map join followed by multi-table insert will generate duplicated result
> -----------------------------------------------------------------------
>
>                 Key: HIVE-1997
>                 URL: https://issues.apache.org/jira/browse/HIVE-1997
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Ted Xu
>
> A map join followed by a multi-table insert generates duplicated results if 
> the insert targets include both a direct insert (FileSinkOperator logic) and 
> a group by/distribute by (ReduceSinkOperator logic).
> The following query reproduces the case:
> {code}
> FROM
> (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, 
> y.value as value2
>  FROM src1 x JOIN src y ON (x.key = y.key)) subq
> INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-12')
> SELECT key1, value1
> INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-13')
> SELECT key2, value2
> GROUP BY key2, value2;
> {code}
> In the query above, the records of table destpart (ds='2010-12-12') are duplicated.
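>
> One way to confirm the duplication, sketched here and not part of the original 
> report: the first insert has no filter or aggregation, so the row count of 
> partition ds='2010-12-12' should equal the row count of the join itself. This 
> assumes the query above has just been run and that src1 and src are unchanged.
> {code}
> -- Rows actually written by the direct (FileSinkOperator) insert.
> SELECT COUNT(1) FROM destpart WHERE ds = '2010-12-12';
> -- Rows produced by the join; the count above is expected to match this one.
> SELECT COUNT(1) FROM src1 x JOIN src y ON (x.key = y.key);
> {code}
> If the first count is larger than the second, the direct-insert partition 
> contains duplicated records.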

