[ https://issues.apache.org/jira/browse/HIVE-15443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
mortalee updated HIVE-15443:
----------------------------
    Attachment: orc_array.patch

> There is a table stored in ORC format that contains a column of type
> array<string>. When each cell of this column contains tens of strings,
> queries on the table fail with an ArrayIndexOutOfBoundsException.
> ------------------------------------------------------------------
>
>                 Key: HIVE-15443
>                 URL: https://issues.apache.org/jira/browse/HIVE-15443
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.0
>         Environment: centos+hive2.1.0+hadoop2.7.2
>            Reporter: mortalee
>              Labels: patch
>         Attachments: orc_array.patch
>
>
> java.lang.Exception: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 1024
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 1024
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:230)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
> 	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
> 	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 1024
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> 	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:106)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:228)
> 	... 12 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> 	at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> 	at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> 	at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> 	at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> 	at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> 	at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> 	at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> 	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> 	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> 	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> 	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> 	... 16 more
> The cause I found is that org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader's
> private final LongColumnVector scratchlcv is created with the default length
> of 1024, and scratchlcv.ensureSize is never called before data is read into
> it. Because a list column expands each batch of rows into all of its child
> strings, a 1024-row batch whose cells each hold tens of strings produces far
> more than 1024 length entries, so the length read runs past the end of the
> scratch vector. A sketch of the missing call follows the repro steps below.
> ps:
> create table test_orc_array(names array<string>) stored as orc;
> select names from test_orc_array group by names limit 10;
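>
> For illustration, a minimal sketch of the guard the attached orc_array.patch
> is meant to add, placed in TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> where the stack trace points. The method shape below is reconstructed from
> the stack trace, so names, signatures, and surrounding code may differ from
> the exact Hive 2.1.0 source:
>
> import java.io.IOException;
> import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
> import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
> import org.apache.orc.impl.IntegerReader;
>
> class ScratchVectorFixSketch {
>   // Reads the per-string length entries for one batch of string values.
>   static void readLengths(IntegerReader lengths, LongColumnVector scratchlcv,
>       BytesColumnVector result, int batchSize) throws IOException {
>     scratchlcv.isNull = result.isNull;
>     // The missing call: scratchlcv is allocated with the default size (1024),
>     // but a list column can expand one row batch into far more than 1024
>     // child strings. Grow the scratch vector before reading the lengths;
>     // otherwise RunLengthIntegerReaderV2.nextVector indexes past
>     // scratchlcv.vector[1023] and throws ArrayIndexOutOfBoundsException.
>     scratchlcv.ensureSize(batchSize, false);
>     lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
>     // ... the real method goes on to read the byte data using these lengths.
>   }
> }
>
> With the scratch vector grown to batchSize up front, the length read stays
> in bounds no matter how many child strings a batch of array<string> rows
> expands into.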