When I created my table, I had to reduce orc.compress.size quite a bit to make a table with that many columns work. This was on Hive 0.12 (I thought it was supposed to be fixed in Hive 0.13, but 3k+ columns is huge). The default orc.compress.size is quite a bit larger (I think in the 268k range), and as far as I understand the ORC writer allocates a buffer of roughly that size per output stream, with several streams per column, so 3000+ columns add up fast. If the level in the snippet below doesn't work, keep moving it smaller. Good luck.
STORED AS orc tblproperties ("orc.compress.size"="8192");

On Thu, May 15, 2014 at 8:11 PM, Premal Shah <premal.j.s...@gmail.com> wrote:

> I have a table in hive stored as text file with 3283 columns. All columns
> are of string data type.
>
> I'm trying to convert that table into an orc file table using this command
> *create table orc_table stored as orc as select * from text_table;*
>
> This is the setting under mapred-site.xml
>
> <property>
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx4G -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -verbose:gc -Xloggc:/mnt/hadoop/@taskid@.gc</value>
>   <final>true</final>
> </property>
>
> The tasks die with this error
>
> 2014-05-16 00:53:42,424 FATAL org.apache.hadoop.mapred.Child: Error running
> child : java.lang.OutOfMemoryError: Java heap space
>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>   at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
>   at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
>   at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
>   at org.apache.hadoop.hive.ql.io.orc.RunLengthByteWriter.flush(RunLengthByteWriter.java:58)
>   at org.apache.hadoop.hive.ql.io.orc.BitFieldWriter.flush(BitFieldWriter.java:44)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:553)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$ListTreeWriter.writeStripe(WriterImpl.java:1455)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.checkMemory(WriterImpl.java:221)
>   at org.apache.hadoop.hive.ql.io.orc.MemoryManager.notifyWriters(MemoryManager.java:168)
>   at org.apache.hadoop.hive.ql.io.orc.MemoryManager.addedRow(MemoryManager.java:157)
>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:2028)
>   at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:86)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:622)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>
> This is the GC output for a task that ran out of memory
>
> 0.690: [GC 17024K->768K(83008K), 0.0019170 secs]
> 0.842: [GC 8488K(83008K), 0.0066800 secs]
> 1.031: [GC 17792K->1481K(83008K), 0.0015400 secs]
> 1.352: [GC 17142K(83008K), 0.0041840 secs]
> 1.371: [GC 18505K->2249K(83008K), 0.0097240 secs]
> 34.779: [GC 28384K(4177280K), 0.0014050 secs]
>
> Anything I can tweak to make it work?
>
> --
> Regards,
> Premal Shah.
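
For reference, applied to the CTAS from your mail it would look roughly like this (just a sketch reusing your table names; 8192 is only the value that happened to work for me, so tune it for your data):

create table orc_table
stored as orc
tblproperties ("orc.compress.size"="8192")
as select * from text_table;

I believe Hive 0.13 also adds a hive.exec.orc.default.buffer.size setting if you'd rather change the default session-wide instead of per table, but I haven't tried that myself.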