[ https://issues.apache.org/jira/browse/HIVE-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aniket Mokashi reassigned HIVE-17448:
-------------------------------------

    Assignee: Aniket Mokashi

> ArrayIndexOutOfBoundsException on ORC tables after adding a struct field
> ------------------------------------------------------------------------
>
>                 Key: HIVE-17448
>                 URL: https://issues.apache.org/jira/browse/HIVE-17448
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.1
>         Environment: Reproduced on Dataproc 1.1, 1.2 (Hive 2.1).
>            Reporter: Nikolay Sokolov
>            Assignee: Aniket Mokashi
>            Priority: Minor
>         Attachments: HIVE-17448.1-branch-2.1.patch
>
> When ORC files were created with an older schema that had a smaller set of struct fields, the schema has since been changed to one with more struct fields, and there are sibling fields following the struct itself, an ArrayIndexOutOfBoundsException is thrown. Steps to reproduce:
> {code:none}
> create external table test_broken_struct(a struct<f1:int, f2:int>, b int) stored as orc;
> insert into table test_broken_struct select named_struct("f1", 1, "f2", 2), 3;
> drop table test_broken_struct;
> create external table test_broken_struct(a struct<f1:int, f2:int, f3:int>, b int) stored as orc;
> select * from test_broken_struct;
> {code}
> The same scenario does not cause a crash on Hive 0.14.
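> The "5" in the exception lines up with the pre-order flattened column ids of the two schemas: the file schema struct<a:struct<f1:int,f2:int>,b:int> flattens to 5 type ids (0=root, 1=a, 2=f1, 3=f2, 4=b), while the reader schema struct<a:struct<f1:int,f2:int,f3:int>,b:int> flattens to 6 (with b at id 5). A minimal standalone sketch (hypothetical, not the actual SchemaEvolution code) of how a positional id mapping pushes the sibling column past the end of the file's type array:

```java
// Hypothetical illustration of the id mismatch behind the reported
// ArrayIndexOutOfBoundsException; this is NOT the org.apache.orc code.
public class SchemaEvolutionSketch {
    public static void main(String[] args) {
        // File written as struct<a:struct<f1:int,f2:int>,b:int>;
        // pre-order flattened type ids: 0=root, 1=a, 2=f1, 3=f2, 4=b
        String[] fileTypes = {"struct", "struct:a", "int:f1", "int:f2", "int:b"};

        // Table redefined as struct<a:struct<f1:int,f2:int,f3:int>,b:int>;
        // flattened ids: 0=root, 1=a, 2=f1, 3=f2, 4=f3, 5=b
        String[] readerTypes = {"struct", "struct:a", "int:f1", "int:f2", "int:f3", "int:b"};

        try {
            // A naive conversion that reuses the reader's flattened id as an
            // index into the file's type array: the sibling column "b" now
            // has reader id 5, but fileTypes has only 5 entries (ids 0..4).
            for (int readerId = 0; readerId < readerTypes.length; readerId++) {
                System.out.println(readerId + ": " + readerTypes[readerId]
                        + " -> " + fileTypes[readerId]);
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // Reader id 5 overflows the 5-element file type array,
            // mirroring the "ArrayIndexOutOfBoundsException: 5" in the log.
            System.out.println("overflow at reader id 5, file has "
                    + fileTypes.length + " types");
        }
    }
}
```

> Note that before the overflow the loop also pairs reader column f3 (id 4) with file column b, so even the "successful" positional mappings after the grown struct are misaligned.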
> Debug log and stack trace:
> {code:none}
> 2017-09-07T00:21:40,266 INFO [main] orc.OrcInputFormat: Using schema evolution configuration variables schema.evolution.columns [a, b] / schema.evolution.columns.types [struct<f1:int,f2:int,f3:int>, int] (isAcidRead false)
> 2017-09-07T00:21:40,267 DEBUG [main] orc.OrcInputFormat: No ORC pushdown predicate
> 2017-09-07T00:21:40,267 INFO [main] orc.ReaderImpl: Reading ORC rows from hdfs://cluster-7199-m/user/hive/warehouse/test_broken_struct/000000_0 with {include: [true, true, true, true, true], offset: 3, length: 159, schema: struct<a:struct<f1:int,f2:int,f3:int>,b:int>}
> Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> 2017-09-07T00:21:40,273 ERROR [main] CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 5
> java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2098)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:195)
>         at org.apache.orc.impl.SchemaEvolution.buildConversionFileTypesArray(SchemaEvolution.java:253)
>         at org.apache.orc.impl.SchemaEvolution.<init>(SchemaEvolution.java:59)
>         at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:149)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:63)
>         at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:87)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:314)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:225)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1691)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:695)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:333)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
>         ... 15 more
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)