When we try to join two large tables, some of the reducers fail with an
OutOfMemory exception.
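For context, the query is of the general shape below (the table and column
names are just placeholders, not our real schema; the real tables are simply
very large):

  SELECT a.key, b.val
  FROM   big_table_a a
  JOIN   big_table_b b ON (a.key = b.key);

The failing reduce tasks die with: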
Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
Looking at the garbage collection logs for these reduce tasks, they are
continuously running full garbage collections, like this:
2011-02-17T14:36:08.295+0100: 1250.547: [Full GC [PSYoungGen: 111055K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809466K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1496600 secs] [Times: user=1.08 sys=0.00, real=0.15 secs]
2011-02-17T14:36:08.600+0100: 1250.851: [Full GC [PSYoungGen: 111057K->53660K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809468K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1360010 secs] [Times: user=1.00 sys=0.01, real=0.13 secs]
2011-02-17T14:36:08.915+0100: 1251.167: [Full GC [PSYoungGen: 111058K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809468K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1325960 secs] [Times: user=0.94 sys=0.00, real=0.14 secs]
2011-02-17T14:36:09.205+0100: 1251.457: [Full GC [PSYoungGen: 111055K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809466K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1301610 secs] [Times: user=0.99 sys=0.00, real=0.13 secs]
“mapred.child.java.opts” is set to:

  -Xmx1024M -XX:+UseCompressedOops -XX:+UseParallelOldGC -XX:+UseNUMA -Djava.net.preferIPv4Stack=true -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/opt/hadoop/logs/task_@tas...@.gc.log
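For per-job experiments the same property can also be overridden from the
Hive CLI, e.g. with a larger heap; the -Xmx1536M below is only an
illustrative value, not something we actually run:

  SET mapred.child.java.opts=-Xmx1536M -XX:+UseParallelOldGC;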
I've been reducing the parameter “hive.exec.reducers.bytes.per.reducer” to
as low as 200M, but I still get the OutOfMemory errors. I would have
expected this to reduce the amount of data sent to each reducer and thus
avoid the OutOfMemory errors.
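My understanding is that Hive picks the number of reducers roughly as

  reducers = min(hive.exec.reducers.max,
                 total input bytes / hive.exec.reducers.bytes.per.reducer)

so dropping from 1G to 200M should multiply the reducer count by about 5,
e.g. from ~100 to ~500 reducers on ~100 GB of join input (these sizes are
made up, just to illustrate the arithmetic):

  SET hive.exec.reducers.bytes.per.reducer=200000000;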
Any ideas on why this happens?
I'm using a trunk build from around 2011-02-03.