Zhang Xinyu created HIVE-3326:
---------------------------------

             Summary: plan for multiple mapjoin followed by a normal join is wrong
                 Key: HIVE-3326
                 URL: https://issues.apache.org/jira/browse/HIVE-3326
             Project: Hive
          Issue Type: Bug
          Components: SQL
    Affects Versions: 0.8.1
         Environment: OS X 10.8; java 1.6.0_33
            Reporter: Zhang Xinyu


example queries:

create table yudi(c1 int, c2 int, c3 int, c4 int);
create table wangmu(c1 int, c2 int, c3 int, c4 int);
select /*+mapjoin(b,c)*/ * from yudi a join yudi b on a.c1=b.c1 join wangmu c 
on b.c2=c.c2 join yudi d on a.c3=d.c3;

In explain mode, I got this plan:

hive> explain select /*+mapjoin(b,c)*/ * from yudi a join yudi b on a.c1=b.c1 
join wangmu c on b.c2=c.c2 join yudi d on a.c3=d.c3;
OK
STAGE DEPENDENCIES:
  Stage-8 is a root stage
  Stage-2 depends on stages: Stage-8
  Stage-7 depends on stages: Stage-2
  Stage-3 depends on stages: Stage-7
  Stage-1 depends on stages: Stage-3

STAGE PLANS:
  Stage: Stage-8
    Map Reduce Local Work
      Alias -> Map Local Tables:
        b
        <Not Important>
  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        a
        <Not Important>
      Local Work:
        Map Reduce Local Work

  Stage: Stage-7
    Map Reduce Local Work
      Alias -> Map Local Tables:
        c
        <Not Important>
  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        file:/var/folders/4w/3_nk1cwd4pd023mzx64p3r480000gn/T/dukezhang/hive_2012-08-01_14-01-37_152_5814747325029961632/-mr-10002
        <Not Important>
      Local Work:
        Map Reduce Local Work

  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        d
          TableScan

        file:/var/folders/4w/3_nk1cwd4pd023mzx64p3r480000gn/T/dukezhang/hive_2012-08-01_14-01-37_152_5814747325029961632/-mr-10002
          Select Operator

      Reduce Operator Tree:
      <Not Important>

As you can see, the mapper of Stage-1 should read the output of Stage-3 
(perhaps '.../-mr-10003'), not the output of Stage-2 ('.../-mr-10002').
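
To make the expected chaining concrete, here is a minimal Java sketch (not Hive 
code) that models Stage-2 -> Stage-3 -> Stage-1 as a linear chain and reports 
the first stage whose input is not the output of the stage before it. All class 
and method names are made up for illustration, the local-work stages are left 
out, and '.../-mr-10003' is only the guessed output directory of Stage-3.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: none of these names exist in Hive.
class PlanChainCheck {

  // One stage in a linear chain. inputDir is the intermediate directory the
  // stage reads (null when it only scans base tables); outputDir is what it writes.
  static final class Stage {
    final String name;
    final String inputDir;
    final String outputDir;

    Stage(String name, String inputDir, String outputDir) {
      this.name = name;
      this.inputDir = inputDir;
      this.outputDir = outputDir;
    }
  }

  // Returns the name of the first stage whose input differs from the previous
  // stage's output, or null if every stage reads what its predecessor wrote.
  static String firstBrokenStage(List<Stage> chain) {
    for (int i = 1; i < chain.size(); i++) {
      Stage prev = chain.get(i - 1);
      Stage cur = chain.get(i);
      if (!prev.outputDir.equals(cur.inputDir)) {
        return cur.name;
      }
    }
    return null;
  }

  // Builds the chain from the plan, with the Stage-1 input left as a parameter.
  static List<Stage> plan(String stage1InputDir) {
    List<Stage> chain = new ArrayList<>();
    chain.add(new Stage("Stage-2", null, ".../-mr-10002"));
    // Assumption: Stage-3 writes to '.../-mr-10003'; the plan does not show this.
    chain.add(new Stage("Stage-3", ".../-mr-10002", ".../-mr-10003"));
    chain.add(new Stage("Stage-1", stage1InputDir, "final result"));
    return chain;
  }

  public static void main(String[] args) {
    // Plan as reported: Stage-1 reads the Stage-2 output.
    System.out.println(firstBrokenStage(plan(".../-mr-10002"))); // prints Stage-1
    // Plan as expected: Stage-1 reads the Stage-3 output.
    System.out.println(firstBrokenStage(plan(".../-mr-10003"))); // prints null
  }
}
```

The check only looks at intermediate directories, so the scan of table d in 
Stage-1 is not modelled.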

While looking for the cause, I found this code (GenMapRedUtils.java, around line 431):
        if (oldMapJoin == null) {
          if (opProcCtx.getParseCtx().getListMapJoinOpsNoReducer().contains(mjOp)
              || local || (oldTask != null) && (parTasks != null)) {
            taskTmpDir = mjCtx.getTaskTmpDir();
            tt_desc = mjCtx.getTTDesc();
            rootOp = mjCtx.getRootMapJoinOp();
          }
        } else {
          GenMRMapJoinCtx oldMjCtx = opProcCtx.getMapJoinCtx(oldMapJoin);
          assert oldMjCtx != null;
          taskTmpDir = oldMjCtx.getTaskTmpDir();
          tt_desc = oldMjCtx.getTTDesc();
          rootOp = oldMjCtx.getRootMapJoinOp();
        }
My query takes the 'else' branch and gets the wrong taskTmpDir. I hacked the 
code so that the query takes the 'if' branch instead, and with that change it works.
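
The effect of the two branches can be sketched in isolation. The following Java 
is a simplified stand-in, not the Hive implementation: MapJoinCtx and 
chooseTmpDir are hypothetical names, the boolean useCurrent stands for the whole 
condition inside the 'if' branch, and the two directories are assumed to belong 
to the first and second map joins respectively.

```java
// Simplified stand-ins; these are not Hive's classes.
class TmpDirChoice {

  // Hypothetical reduction of a map-join context to the one field of interest.
  static final class MapJoinCtx {
    final String taskTmpDir;

    MapJoinCtx(String taskTmpDir) {
      this.taskTmpDir = taskTmpDir;
    }
  }

  // Mirrors only the shape of the quoted snippet: if an older map join is
  // recorded, its context supplies the directory ('else' branch); otherwise
  // the current context is used when useCurrent holds ('if' branch).
  // Returning null when neither applies is a simplification of this sketch.
  static String chooseTmpDir(MapJoinCtx oldCtx, MapJoinCtx currentCtx, boolean useCurrent) {
    if (oldCtx == null) {
      return useCurrent ? currentCtx.taskTmpDir : null;
    }
    return oldCtx.taskTmpDir;
  }

  public static void main(String[] args) {
    MapJoinCtx first = new MapJoinCtx(".../-mr-10002");  // assumed: first map join
    MapJoinCtx second = new MapJoinCtx(".../-mr-10003"); // assumed: second map join

    // 'else' branch: the older context wins, matching the directory in the plan.
    System.out.println(chooseTmpDir(first, second, true)); // prints .../-mr-10002
    // 'if' branch: the current context is used.
    System.out.println(chooseTmpDir(null, second, true));  // prints .../-mr-10003
  }
}
```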
