[ https://issues.apache.org/jira/browse/HIVE-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi reassigned HIVE-17448:
-------------------------------------

    Assignee: Aniket Mokashi

> ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
> ------------------------------------------------------------------------
>
>                 Key: HIVE-17448
>                 URL: https://issues.apache.org/jira/browse/HIVE-17448
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.1
>         Environment: Reproduced on Dataproc 1.1, 1.2 (Hive 2.1).
>            Reporter: Nikolay Sokolov
>            Assignee: Aniket Mokashi
>            Priority: Minor
>         Attachments: HIVE-17448.1-branch-2.1.patch
>
>
> When ORC files were written with an older schema that had fewer struct fields, 
> the table schema is later changed to one with more struct fields, and a sibling 
> field follows the struct itself, an ArrayIndexOutOfBoundsException is thrown. 
> Steps to reproduce:
> {code:sql}
> create external table test_broken_struct(a struct<f1:int, f2:int>, b int) stored as orc;
> insert into table test_broken_struct select named_struct("f1", 1, "f2", 2), 3;
> drop table test_broken_struct;
> create external table test_broken_struct(a struct<f1:int, f2:int, f3:int>, b int) stored as orc;
> select * from test_broken_struct;
> {code}
> The same scenario does not cause a crash on Hive 0.14.
> Debug log and stack trace:
> {code:none}
> 2017-09-07T00:21:40,266  INFO [main] orc.OrcInputFormat: Using schema evolution configuration variables schema.evolution.columns [a, b] / schema.evolution.columns.types [struct<f1:int,f2:int,f3:int>, int] (isAcidRead false)
> 2017-09-07T00:21:40,267 DEBUG [main] orc.OrcInputFormat: No ORC pushdown predicate
> 2017-09-07T00:21:40,267  INFO [main] orc.ReaderImpl: Reading ORC rows from hdfs://cluster-7199-m/user/hive/warehouse/test_broken_struct/000000_0 with {include: [true, true, true, true, true], offset: 3, length: 159, schema: struct<a:struct<f1:int,f2:int,f3:int>,b:int>}
> Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> 2017-09-07T00:21:40,273 ERROR [main] CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:195)
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:253)
>         at org.apache.orc.impl.SchemaEvolution.<init>(SchemaEvolution.java:59)
>         at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:149)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63)
>         at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:225)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
>         ... 15 more
> {code}
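The trace points into SchemaEvolution.buildConversionFileTypesArray. A minimal, hypothetical Java sketch of the suspected failure mode (class and method names below are invented for illustration, not the actual ORC code): ORC assigns column ids in pre-order, so adding f3 to the struct shifts the id of the sibling column b from 4 to 5, while the file on disk still has only five types (ids 0 through 4). Indexing the file-side type array by the reader-side id then overruns it for every column after the evolved struct.

```java
// Hypothetical sketch, not the real ORC implementation.
public class SchemaShiftDemo {
    // Pre-order column ids for the FILE schema
    // struct<a:struct<f1:int,f2:int>,b:int>:
    //   0=root, 1=a, 2=f1, 3=f2, 4=b  -> 5 types stored in the file
    static final String[] FILE_TYPES = {"root", "a", "f1", "f2", "b"};

    // Pre-order column ids for the READER schema
    // struct<a:struct<f1:int,f2:int,f3:int>,b:int>:
    //   0=root, 1=a, 2=f1, 3=f2, 4=f3, 5=b  -> b shifts from id 4 to id 5
    static final String[] READER_TYPES = {"root", "a", "f1", "f2", "f3", "b"};

    // Naive id-based mapping: look up the file type by the reader-side id.
    // This overruns FILE_TYPES for every column after the evolved struct.
    static String fileTypeForReaderId(int readerId) {
        return FILE_TYPES[readerId];
    }

    public static void main(String[] args) {
        int readerIdOfB = READER_TYPES.length - 1; // b is the last column: id 5
        try {
            System.out.println("b maps to " + fileTypeForReaderId(readerIdOfB));
        } catch (ArrayIndexOutOfBoundsException e) {
            // Same failure mode and index as the stack trace above.
            System.out.println("ArrayIndexOutOfBoundsException: " + readerIdOfB);
        }
    }
}
```

This also suggests why Hive 0.14 did not crash: the old reader matched struct fields by position up to the file's field count instead of indexing file types by reader-side id.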



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
