[ https://issues.apache.org/jira/browse/HIVE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-1997:
---------------------------------

    Component/s: Query Processor

> Map join followed by multi-table insert will generate duplicated result
> -----------------------------------------------------------------------
>
>                 Key: HIVE-1997
>                 URL: https://issues.apache.org/jira/browse/HIVE-1997
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Ted Xu
>
> A map join followed by a multi-table insert generates duplicated results if
> the insert targets contain both a direct insert (FileSinkOperator logic) and
> a group by/distribute by (ReduceSinkOperator logic).
> The following query reproduces the case:
> {code}
> FROM
> (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2,
> y.value as value2
> FROM src1 x JOIN src y ON (x.key = y.key)) subq
> INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-12')
> SELECT key1, value1
> INSERT OVERWRITE TABLE destpart PARTITION (ds='2010-12-13')
> SELECT key2, value2
> GROUP BY key2, value2;
> {code}
> In the query above, the records of table destpart (ds='2010-12-12') are duplicated.
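
One possible way to confirm the duplication after running the query is to compare row counts. This is only a sketch: it assumes that the same join without the MAPJOIN hint (a common join) is not affected by the bug and can serve as the baseline, which the report does not state explicitly. It uses only the tables named in the report.
{code}
-- Baseline (assumption): row count of the same join without the MAPJOIN hint
SELECT COUNT(1) FROM src1 x JOIN src y ON (x.key = y.key);

-- Row count actually written by the direct insert
SELECT COUNT(1) FROM destpart WHERE ds = '2010-12-12';
{code}
If the second count is larger than the first, the direct-insert branch wrote more rows than the join produced. A GROUP BY ... HAVING check on destpart would not be a reliable test here, because the join itself can legitimately produce repeated (key1, value1) pairs.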