Hi all,

I'm trying to run a MapReduce job that converts CSV data into ORC using
OrcNewOutputFormat (the reduce phase is required to satisfy some partitioning
logic), but I'm getting an OOM error in the reduce phase (during the merge, to
be exact) with the stack trace attached below. It happens for one particular
table that has about 800 columns, and the error seems common across all
reducers (minimum reducer input is about 20 records, maximum is about 100
million). I'm trying to figure out the exact cause, since I have used the same
job to convert tables with 100-10000 columns without any memory or config changes.

What concerns me in the stack trace is this line:

        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.writeMetadata(WriterImpl.java:2327)

Why is it going OOM while trying to write metadata?

I originally believed this was simply due to the number of open buffers (as
mentioned in
http://mail-archives.apache.org/mod_mbox/hive-dev/201410.mbox/%3c543d5eb6.2000...@apache.org%3E).
So I wrote a bit of code to reproduce the error on my local setup by creating
an instance of OrcRecordWriter and writing a large number of columns. I did
get a similar heap space error, but it went OOM while trying to flush the
stripes, with this in the stack trace:

at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)
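
In case it's useful, this is roughly what the local reproduction looked like.
It's a simplified sketch rather than the exact code: the path, column count,
row count, and the all-strings schema are placeholders, and I drove the writer
through OrcFile.createWriter directly instead of going through
OrcNewOutputFormat:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.io.orc.OrcFile;
    import org.apache.hadoop.hive.ql.io.orc.Writer;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
    import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

    public class OrcWideTableRepro {
      public static void main(String[] args) throws Exception {
        int numCols = 800;            // placeholder: wide table
        long numRows = 10_000_000L;   // placeholder: enough rows to fill several stripes

        // Build a struct inspector with numCols string columns.
        List<String> names = new ArrayList<String>();
        List<ObjectInspector> fields = new ArrayList<ObjectInspector>();
        for (int i = 0; i < numCols; i++) {
          names.add("col" + i);
          fields.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        }
        StructObjectInspector inspector =
            ObjectInspectorFactory.getStandardStructObjectInspector(names, fields);

        Configuration conf = new Configuration();
        Writer writer = OrcFile.createWriter(new Path("/tmp/wide_table_repro.orc"),
            OrcFile.writerOptions(conf).inspector(inspector));

        // Write the same row repeatedly; every column gets its own streams/buffers.
        List<String> row = new ArrayList<String>();
        for (int i = 0; i < numCols; i++) {
          row.add("some value " + i);
        }
        for (long r = 0; r < numRows; r++) {
          writer.addRow(row);
        }
        writer.close();
      }
    }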

This issue in the dev environment was resolved by setting

hive.exec.orc.default.buffer.size=32k

Will the same setting work for the original error?
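
For reference, applying that setting to just this job would presumably be a
driver-side override on the job configuration, something like the sketch below
(assuming, and I have not verified this, that the writer created by
OrcNewOutputFormat picks the property up from the job conf):

    // hypothetical driver-side override of the ORC stream buffer size;
    // same hive.exec.orc.default.buffer.size property as above
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "csv-to-orc");   // job name is a placeholder
    job.getConfiguration().set("hive.exec.orc.default.buffer.size",
        String.valueOf(32 * 1024));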

For different reasons I cannot change the reducer memory or lower the
buffer size even at a job level. For now, I am just trying to understand
the source of this error. Can anyone please help?

Original OOM stacktrace:

FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.write(OutStream.java:140)
        at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
        at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.writeMetadata(WriterImpl.java:2327)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2426)
        at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
