Sergey Shelukhin created HIVE-6682:
--------------------------------------

             Summary: nonstaged mapjoin table memory check may be broken
                 Key: HIVE-6682
                 URL: https://issues.apache.org/jira/browse/HIVE-6682
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.13.0
            Reporter: Sergey Shelukhin
            Assignee: Sergey Shelukhin


We are getting the error below from the map task with the nonstaged load, while 
the staged load works correctly. We do not set the memory threshold anywhere 
near that low (the trace reports a usage fraction of only 0.197), so it seems 
the settings are just not handled correctly. The check seems to always trigger 
the first time it runs. Given that a map task may hold a lot more than just the 
hashtable, we may also need to adjust the memory check (e.g. have separate 
configs).

{noformat}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21       Processing rows:        200000  Hashtable size: 199999  Memory usage:   204001888       percentage:     0.197
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21       Processing rows:        200000  Hashtable size: 199999  Memory usage:   204001888       percentage:     0.197
        at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:104)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:150)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:165)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
        at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
        ... 8 more
Caused by: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2014-03-14 08:11:21 Processing rows:        200000  Hashtable size: 199999  Memory usage:   204001888       percentage:     0.197
        at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:91)
        at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:248)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:791)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
        at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:375)
        at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:346)
        at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:147)
        at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:82)
        ... 15 more
{noformat}
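
To illustrate why the numbers in the trace look wrong, here is a minimal sketch of a heap-fraction check of the kind the exception message suggests. This is not the actual MapJoinMemoryExhaustionHandler code: the class MemoryCheckSketch and the methods usedFraction and isExhausted are hypothetical names, the maximum heap is back-computed from the reported usage and percentage, and 0.90 is only an illustrative threshold that we would consider sane.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical sketch; names and threshold values are assumptions, not Hive code.
final class MemoryCheckSketch {

    /** Fraction of the maximum heap that is currently in use. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / maxBytes;
    }

    /** True when the used fraction exceeds the configured threshold. */
    static boolean isExhausted(long usedBytes, long maxBytes, double threshold) {
        return usedFraction(usedBytes, maxBytes) > threshold;
    }

    public static void main(String[] args) {
        // Values taken from the exception message in the trace.
        long reportedUsed = 204001888L;
        // Assumption: back out the max heap from the reported percentage (0.197).
        long impliedMax = Math.round(reportedUsed / 0.197);

        // With an illustrative threshold of 0.90 the check should not fire.
        System.out.println("threshold 0.90 -> " + isExhausted(reportedUsed, impliedMax, 0.90));
        // With a threshold that was never initialized (0.0), any usage trips it.
        System.out.println("threshold 0.0  -> " + isExhausted(reportedUsed, impliedMax, 0.0));

        // The same check against the live JVM, for comparison.
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        long used = bean.getHeapMemoryUsage().getUsed();
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("live fraction  -> " + usedFraction(used, max));
    }
}
```

A usage fraction of 0.197 only trips a check like this when the effective threshold is below 0.197, which is consistent with the guess that the configured value is not reaching the handler on this path. It also shows why a single heap-wide fraction is a blunt instrument inside a map task, where the hashtable is only one of several consumers.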



--
This message was sent by Atlassian JIRA
(v6.2#6252)
