When we try to join two large tables, some of the reducers fail with an OutOfMemoryError during the shuffle phase.
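
For context, the failing job is essentially a plain equi-join of the two tables, roughly along these lines (table and column names below are placeholders, not our real schema):

    -- Simplified illustration only; the real tables and columns differ.
    SELECT a.id, a.some_col, b.other_col
    FROM   big_table_a a
    JOIN   big_table_b b
      ON   a.id = b.id;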

Error: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1408)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)


Looking at garbage collection for these reduce tasks, the JVM is continuously running back-to-back full GCs that reclaim almost nothing (ParOldGen stays at 698410K of its 699072K capacity), like this:
2011-02-17T14:36:08.295+0100: 1250.547: [Full GC [PSYoungGen: 111055K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809466K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1496600 secs] [Times: user=1.08 sys=0.00, real=0.15 secs]
2011-02-17T14:36:08.600+0100: 1250.851: [Full GC [PSYoungGen: 111057K->53660K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809468K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1360010 secs] [Times: user=1.00 sys=0.01, real=0.13 secs]
2011-02-17T14:36:08.915+0100: 1251.167: [Full GC [PSYoungGen: 111058K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809468K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1325960 secs] [Times: user=0.94 sys=0.00, real=0.14 secs]
2011-02-17T14:36:09.205+0100: 1251.457: [Full GC [PSYoungGen: 111055K->53659K(233024K)] [ParOldGen: 698410K->698410K(699072K)] 809466K->752070K(932096K) [PSPermGen: 14450K->14450K(21248K)], 0.1301610 secs] [Times: user=0.99 sys=0.00, real=0.13 secs]


I have mapred.child.java.opts set to:

    -Xmx1024M -XX:+UseCompressedOops -XX:+UseParallelOldGC -XX:+UseNUMA -Djava.net.preferIPv4Stack=true -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCDetails -Xloggc:/opt/hadoop/logs/task_@tas...@.gc.log
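
In case it matters: I believe the same options can also be overridden per Hive session rather than cluster-wide; a minimal sketch with the flag list abbreviated (we actually set it cluster-wide as quoted above):

    -- Hypothetical per-session override; flags abbreviated, full set as above.
    SET mapred.child.java.opts=-Xmx1024M -XX:+UseCompressedOops -XX:+UseParallelOldGC -XX:+UseNUMA;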

I've been lowering the parameter hive.exec.reducers.bytes.per.reducer, to as low as 200MB, but I still get the OutOfMemory errors. I would have expected this to reduce the amount of data sent to each reducer and thus prevent the OutOfMemory errors.
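
For reference, this is roughly how I've been driving it down before running the join (the byte value shown is just my latest attempt):

    -- Hive session setting I've been experimenting with (~200MB per reducer).
    SET hive.exec.reducers.bytes.per.reducer=200000000;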

Any ideas on why this happens?

I'm using a trunk build from around 2011-02-03.
