[ https://issues.apache.org/jira/browse/HIVE-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727696#comment-13727696 ]
Hudson commented on HIVE-4952:
------------------------------

SUCCESS: Integrated in Hive-trunk-h0.21 #2239 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2239/])
HIVE-4952 : When hive.join.emit.interval is small, queries optimized by Correlation Optimizer may generate wrong results (Yin Huai via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1509542)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DemuxOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/QueryPlanTreeTransformation.java
* /hive/trunk/ql/src/test/queries/clientpositive/correlationoptimizer15.q
* /hive/trunk/ql/src/test/results/clientpositive/correlationoptimizer15.q.out

> When hive.join.emit.interval is small, queries optimized by Correlation Optimizer may generate wrong results
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-4952
>                 URL: https://issues.apache.org/jira/browse/HIVE-4952
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4952.D11889.1.patch, HIVE-4952.D11889.2.patch, replay.txt
>
>
> If we have a query like this ...
> {code:sql}
> SELECT xx.key, xx.cnt, yy.key
> FROM
> (SELECT x.key as key, count(1) as cnt FROM src1 x JOIN src1 y ON (x.key = y.key) group by x.key) xx
> JOIN src yy
> ON xx.key=yy.key;
> {code}
> After the Correlation Optimizer is applied, the operator tree in the reducer will be
> {code}
>      JOIN2
>        |
>        |
>       MUX
>      /   \
>     /     \
>   GBY      |
>    |       |
>  JOIN1     |
>     \     /
>      \   /
>      DEMUX
> {code}
> For JOIN2, the right table will arrive at this operator first. If
> hive.join.emit.interval is small, e.g. 1, JOIN2 will output results even if
> it has not yet received any row from the left table. The logic related to
> hive.join.emit.interval in JoinOperator assumes that inputs will be ordered
> by the tag.
> But, if a query has been optimized by the Correlation Optimizer, this
> assumption may not hold for those JoinOperators inside the reducer.
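
The tag-ordering assumption described in the issue can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `EmitIntervalSketch`, `joinOneKeyGroup`, and `emit` are hypothetical names, the model handles a single key group of a two-input inner join, and it is not Hive's actual JoinOperator code. It mimics the idea of emitting early once the last input has buffered `emitInterval` rows, which is only safe when every row of the earlier tag has already arrived. The exact symptom in Hive may differ from what this model produces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a two-input inner join for ONE key group.
// It is not Hive's JoinOperator; it only mimics the "emit early" idea.
class EmitIntervalSketch {

    // Each input row is {tag, value}: tag 0 = left input, tag 1 = right (last) input.
    static List<String> joinOneKeyGroup(int[][] rows, int emitInterval) {
        List<Integer> left = new ArrayList<>();
        List<Integer> right = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (int[] row : rows) {
            if (row[0] == 0) {
                left.add(row[1]);
            } else {
                right.add(row[1]);
                // Early emit: assumes all left rows for this key are already buffered.
                if (right.size() >= emitInterval) {
                    emit(left, right, out);
                    right.clear(); // the emitted right rows are not kept
                }
            }
        }
        emit(left, right, out); // end of key group: flush what is still buffered
        return out;
    }

    // Inner join of the buffered rows: one output per (left, right) pair.
    static void emit(List<Integer> left, List<Integer> right, List<String> out) {
        for (int l : left) {
            for (int r : right) {
                out.add(l + "|" + r);
            }
        }
    }

    public static void main(String[] args) {
        int[][] tagOrdered = {{0, 10}, {1, 1}, {1, 2}}; // left row first
        int[][] rightFirst = {{1, 1}, {1, 2}, {0, 10}}; // right rows first
        System.out.println(joinOneKeyGroup(tagOrdered, 1));
        System.out.println(joinOneKeyGroup(rightFirst, 1));
        System.out.println(joinOneKeyGroup(rightFirst, 1000));
    }
}
```

In this model, tag-ordered input with an interval of 1 yields both joined rows. The same rows with the right side arriving first yield nothing, because each early emit runs against an empty left buffer and then discards the right row. With a large interval the early emit never fires, so the arrival order no longer matters.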