Looks like you are running out of memory. Try increasing the heap memory or 
reducing the stripe size. How many columns are you writing? Any idea how many 
record writers are open per map task?
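
To show why the column count and the number of open record writers matter, here is a rough back-of-envelope sketch. It is not taken from the ORC writer code. It assumes each open writer holds about one buffer per column stream, that each buffer is 256 KiB (believed to be the default ORC compression buffer size), and that a column has about three streams. The class name, method name, and sample numbers are all hypothetical.

```java
public class OrcWriterMemoryEstimate {

    // Rough estimate of the bytes held in stream buffers by all open writers.
    // Assumption: one buffer of bufferSizeBytes per stream, per column, per writer.
    static long estimateBufferBytes(int openWriters, int columns,
                                    int streamsPerColumn, long bufferSizeBytes) {
        return (long) openWriters * columns * streamsPerColumn * bufferSizeBytes;
    }

    public static void main(String[] args) {
        int openWriters = 4;            // hypothetical: writers open in one map task
        int columns = 200;              // hypothetical: columns in the schema
        int streamsPerColumn = 3;       // assumption: e.g. present, data, length
        long bufferSize = 256L * 1024;  // assumption: 256 KiB buffer per stream

        long bytes = estimateBufferBytes(openWriters, columns, streamsPerColumn, bufferSize);
        System.out.println("Estimated buffer memory: " + (bytes / (1024 * 1024)) + " MiB");
    }
}
```

If a number of that size approaches the map task heap (set through mapreduce.map.java.opts), either the heap has to grow or the per-writer footprint has to shrink. Whether OrcNewOutputFormat in your Hive version picks up settings such as hive.exec.orc.default.stripe.size from the job configuration is worth checking before relying on it.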

- Prasanth

On Sep 22, 2015, at 4:32 AM, Patrick Duin <patd...@gmail.com> wrote:

Hi all,

I am struggling to understand a stack trace I get when trying to write an ORC 
file. I am using hive-0.13.0/hadoop-2.4.0.


2015-09-21 09:15:44,603 INFO [main] org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@2ce49e21
java.lang.IllegalArgumentException: Column has wrong number of index entries found: org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry$Builder@6eeb967b expected: 1
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:578)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1398)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
        at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1990)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:774)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-09-21 09:15:45,988 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:583)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
        at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)


I've seen https://issues.apache.org/jira/browse/HIVE-9080 and I think that 
might be related.

I am not using Hive itself, though. I am running a map-only job that writes 
through OrcNewOutputFormat.

Any pointers would be appreciated. Has anyone seen this before?



Thanks,

 Patrick
