Bryan, This might be a bug. Are you testing with sample data? Will it be possible to send me the values in col_b? Alternatively, you can file a bug and upload the data for failing column. I can take a look at it.
There is another bug that was opened recently. https://issues.apache.org/jira/browse/HIVE-6382 You can try applying patch from HIVE-6382 on top of HIVE-5991 and see if it solves the issue. HIVE-6382 added some protection against integer overflows. Let me know if it helps. Thanks Prasanth Jayachandran On Feb 12, 2014, at 3:46 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote: > Prasanth, > > I applied the patched Hive and restarted Hadoop/Hive. I selected from a text > table into a newly created ORC table, and found the same errors. Perhaps > this is a different problem? > > hive> select min(col_b), max(col_b), count(col_b) from text_data_range_572643; > 46 411592614 572643 > > When selecting from the ORC table, the following query fails: > hive> select col_b from orc_test; > > With the errors seen previously: > > Error: java.io.IOException: java.io.IOException: > java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197) > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183) > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:396) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) > Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) > at > org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101) > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108) > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302) > ... 11 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readPatchedBaseValues(RunLengthIntegerReaderV2.java:171) > at > org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54) > at > org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:473) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1157) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2196) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:106) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:57) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274) > ... 15 more > > col_b is a bigint. > > Thoughts? > > Regards, > > Bryan Jeffrey > > > On Wed, Feb 12, 2014 at 2:47 PM, Prasanth Jayachandran > <pjayachand...@hortonworks.com> wrote: > Hi Bryan > > HIVE-5991 is a writer bug. From the exception I see that exception happens > while reading the ORC file. It might happen that its trying to read a > corrupted ORC file. > Are you trying to read the old ORC file after applying this patch? Since it > is a writer bug, the ORC file needs to be regenerated. Can you try > regenerating the file and see if it happens again? > > Thanks > Prasanth Jayachandran > > On Feb 12, 2014, at 11:35 AM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote: > >> Hello. >> >> I am running Hive 0.12.0 & Hadoop 2.2.0. I attempted to apply the fix >> described in the patch here: >> https://issues.apache.org/jira/secure/attachment/12617931/HIVE-5991.1.patch >> >> I applied the patch, and ran 'ant tar' from the src directory. A tar file >> for distribution was created, and it appeared that items recompiled. >> However, I am still seeing the following errors: >> >> Error: java.io.IOException: java.io.IOException: >> java.lang.ArrayIndexOutOfBoundsException: 0 >> at >> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) >> at >> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) >> at >> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304) >> at >> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220) >> at >> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197) >> at >> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183) >> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) >> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) >> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) >> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162) >> at java.security.AccessController.doPrivileged(Native Method) >> at javax.security.auth.Subject.doAs(Subject.java:396) >> at >> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) >> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157) >> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0 >> at >> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) >> at >> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) >> at >> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276) >> at >> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101) >> at >> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41) >> at >> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108) >> at >> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302) >> ... 11 more >> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 >> at >> org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readPatchedBaseValues(RunLengthIntegerReaderV2.java:171) >> at >> org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54) >> at >> org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287) >> at >> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:473) >> at >> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1157) >> at >> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2196) >> at >> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:106) >> at >> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:57) >> at >> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274) >> ... 15 more >> >> >> I believe that the symptoms match the ticket (#5991). I altered the 'int' >> to 'long' as described in the patch. Did I miss a step when recompiling? >> >> Regards, >> >> Bryan Jeffrey > > > CONFIDENTIALITY NOTICE > NOTICE: This message is intended for the use of the individual or entity to > which it is addressed and may contain information that is confidential, > privileged and exempt from disclosure under applicable law. If the reader of > this message is not the intended recipient, you are hereby notified that any > printing, copying, dissemination, distribution, disclosure or forwarding of > this communication is strictly prohibited. If you have received this > communication in error, please contact the sender immediately and delete it > from your system. Thank You. > -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.