The Hive version is 1.2.1000.2.6.1.0-0129 (HDP 2.6.1.0). For now I have mitigated the problem by recreating the table, so I don't have the relevant ORC files right now.
Also, I am curious: how would "*hive.acid.key.index*" help in debugging this problem? I was going through the source code, and it seems the indexing into keyIndex in discoverKeyBounds is the problem: if the key index parsed from the ORC footer ends up with fewer entries than the file has stripes, keyIndex[firstStripe + stripeCount - 1] runs past the end of the array.

/**
 * Find the key range for bucket files.
 * @param reader the reader
 * @param options the options for reading with
 * @throws IOException
 */
private void discoverKeyBounds(Reader reader,
                               Reader.Options options) throws IOException {
  RecordIdentifier[] keyIndex = OrcRecordUpdater.parseKeyIndex(reader);
  long offset = options.getOffset();
  long maxOffset = options.getMaxOffset();
  int firstStripe = 0;
  int stripeCount = 0;
  boolean isTail = true;
  List<StripeInformation> stripes = reader.getStripes();
  for (StripeInformation stripe : stripes) {
    if (offset > stripe.getOffset()) {
      firstStripe += 1;
    } else if (maxOffset > stripe.getOffset()) {
      stripeCount += 1;
    } else {
      isTail = false;
      break;
    }
  }
  if (firstStripe != 0) {
    minKey = keyIndex[firstStripe - 1];
  }
  if (!isTail) {
    maxKey = keyIndex[firstStripe + stripeCount - 1];
  }
}

If this is still an open issue I would like to submit a patch for it. Let me know how I can further debug it.
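In the meantime, here is a rough, untested sketch of how I was planning to read the property myself once I can get hold of a bucket file again. It is written against the org.apache.hadoop.hive.ql.io.orc Reader API; the class name and the assumption that the value is a ";"-separated list with one entry per stripe are my own.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;

public class DumpAcidKeyIndex {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // args[0]: path to one bucket file under base_x/, e.g. .../base_x/bucket_00000
    Path bucket = new Path(args[0]);
    Reader reader = OrcFile.createReader(bucket, OrcFile.readerOptions(conf));

    System.out.println("stripes = " + reader.getStripes().size());

    // hive.acid.key.index lives in the user metadata section of the ORC footer
    if (reader.getMetadataKeys().contains("hive.acid.key.index")) {
      ByteBuffer buf = reader.getMetadataValue("hive.acid.key.index").duplicate();
      String keyIndex = StandardCharsets.UTF_8.decode(buf).toString();
      System.out.println("hive.acid.key.index = " + keyIndex);
      // assuming one "originalTransaction,bucket,rowId;" entry per stripe
      System.out.println("entries = " + keyIndex.split(";").length);
    } else {
      System.out.println("hive.acid.key.index not found in the footer");
    }
  }
}

Comparing the entry count against the stripe count for each bucket_0000N should show whether the key index is shorter than the stripe list.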
Thanks,
Aviral Agarwal

On Feb 15, 2018 23:10, "Eugene Koifman" <ekoif...@hortonworks.com> wrote:

> What version of Hive is this?
>
> Can you isolate this to a specific partition?
>
> The table/partition you are reading should have a directory called base_x/
> with several bucket_0000N files. (if you see more than 1 base_x, take one
> with highest x)
>
> Each bucket_0000N should have a "*hive.acid.key.index*" property in user
> metadata section of ORC footer.
>
> Could you share the value of this property?
>
> You can use orcfiledump
> (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-ORCFileDumpUtility)
> for this but it requires https://issues.apache.org/jira/browse/ORC-223.
>
> Thanks,
>
> Eugene
>
>
> *From: *Aviral Agarwal <aviral12...@gmail.com>
> *Reply-To: *"user@hive.apache.org" <user@hive.apache.org>
> *Date: *Thursday, February 15, 2018 at 2:08 AM
> *To: *"user@hive.apache.org" <user@hive.apache.org>
> *Subject: *ORC ACID table returning Array Index Out of Bounds
>
> Hi guys,
>
> I am running into the following error when querying a ACID table :
>
> Caused by: java.lang.RuntimeException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 8
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:135)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:101)
>     at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
>     at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
>     at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
>     at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
>     at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
>     at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
>     ... 14 more
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 8
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:253)
>     at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
>     ... 25 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 8
>     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.discoverKeyBounds(OrcRawRecordMerger.java:378)
>     at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:447)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1436)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1323)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
>     ... 26 more
>
> Any help would be appreciated.
>
> Regards,
> Aviral Agarwal