Thanks Bryan. This is more than sufficient. As a workaround, can you try setting hive.optimize.sort.dynamic.partition=false and see if it helps? In the meantime, I will diagnose the issue.
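For instance, with the repro you included below, the whole run would look something like this (a sketch only; I have not re-run your exact query, so treat the session as illustrative):

[root@server]# hive -e "set hive.optimize.sort.dynamic.partition=false; set hive.exec.dynamic.partition.mode=nonstrict; set hive.enforce.sorting=true; set mapred.job.queue.name=orc_queue; insert into table data partition (range) select * from loading_data_0;"

With that setting off, the planner should skip the sorted dynamic-partition optimization that is new in 0.13 (the extra key it adds is visible as the "-1 (type: int)" entry in the Reduce Output Operator of your explain output) and fall back to the pre-0.13 plan for dynamic-partition inserts.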
Thanks
Prasanth Jayachandran

On Apr 22, 2014, at 10:36 AM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:

> Prasanth,
>
> Was this additional information sufficient? This is a large roadblock to
> our adopting Hive 0.13.0.
>
> Regards,
>
> Bryan Jeffrey
>
>
> On Tue, Apr 22, 2014 at 7:41 AM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
>
> Prasanth,
>
> The error seems to occur with just about any table. I mocked up a very
> simple table (including input data, etc.) to illustrate the problem and
> make it easy to reproduce.
>
> hive> create table loading_data_0 (A smallint, B smallint) partitioned by (range int) row format delimited fields terminated by '|' stored as textfile;
> hive> create table data (A smallint, B smallint) partitioned by (range int) clustered by (A) sorted by (A, B) into 8 buckets stored as orc tblproperties ("orc.compress" = "SNAPPY", "orc.index" = "true");
>
> [root@server ~]# cat test.input
> 123|436
> 423|426
> 223|456
> 923|486
> 023|406
>
> hive> load data inpath '/test.input' into table loading_data_0 partition (range=123);
>
> [root@server scripts]# hive -e "describe data;"
> Logging initialized using configuration in /opt/hadoop/latest-hive/conf/hive.log4j
> OK
> Time taken: 0.508 seconds
> OK
> a                       smallint
> b                       smallint
> range                   int
>
> # Partition Information
> # col_name              data_type               comment
>
> range                   int
> Time taken: 0.422 seconds, Fetched: 8 row(s)
>
> [root@server scripts]# hive -e "describe loading_data_0;"
> Logging initialized using configuration in /opt/hadoop/latest-hive/conf/hive.log4j
> OK
> Time taken: 0.511 seconds
> OK
> a                       smallint
> b                       smallint
> range                   int
>
> # Partition Information
> # col_name              data_type               comment
>
> range                   int
> Time taken: 0.37 seconds, Fetched: 8 row(s)
>
>
> [root@server scripts]# hive -e "set hive.exec.dynamic.partition.mode=nonstrict; set hive.enforce.sorting=true; set mapred.job.queue.name=orc_queue; explain insert into table data partition (range) select * from loading_data_0;"
> Logging initialized using configuration in /opt/hadoop/latest-hive/conf/hive.log4j
> OK
> Time taken: 0.564 seconds
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: loading_data_0
>             Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: a (type: smallint), b (type: smallint), range (type: int)
>               outputColumnNames: _col0, _col1, _col2
>               Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>               Reduce Output Operator
>                 key expressions: _col2 (type: int), -1 (type: int), _col0 (type: smallint), _col1 (type: smallint)
>                 sort order: ++++
>                 Map-reduce partition columns: _col2 (type: int)
>                 Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>                 value expressions: _col0 (type: smallint), _col1 (type: smallint), _col2 (type: int)
>       Reduce Operator Tree:
>         Extract
>           Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 5 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
>                 serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
>                 name: data
>
>   Stage: Stage-0
>     Move Operator
>       tables:
>           partition:
>             range
>           replace: false
>           table:
>               input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>               output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
>               serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
>               name: data
>
> Time taken: 0.913 seconds, Fetched: 45 row(s)
>
>
> [root@server]# hive -e "set hive.exec.dynamic.partition.mode=nonstrict; set hive.enforce.sorting=true; set mapred.job.queue.name=orc_queue; insert into table data partition (range) select * from loading_data_0;"
> Logging initialized using configuration in /opt/hadoop/latest-hive/conf/hive.log4j
> OK
> Time taken: 0.513 seconds
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1398130933303_1467, Tracking URL = http://server:8088/proxy/application_1398130933303_1467/
> Kill Command = /opt/hadoop/latest-hadoop/bin/hadoop job -kill job_1398130933303_1467
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
> 2014-04-22 11:33:26,984 Stage-1 map = 0%, reduce = 0%
> 2014-04-22 11:33:51,833 Stage-1 map = 100%, reduce = 100%
> Ended Job = job_1398130933303_1467 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1398130933303_1467_m_000000 (and more) from job job_1398130933303_1467
>
> Task with the most failures(4):
> -----
> Task ID:
>   task_1398130933303_1467_m_000000
>
> URL:
>   http://server:8088/taskdetails.jsp?jobid=job_1398130933303_1467&tipid=task_1398130933303_1467_m_000000
> -----
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"a":123,"b":436,"range":123}
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"a":123,"b":436,"range":123}
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>         ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:327)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>         ... 9 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>         at java.util.ArrayList.get(ArrayList.java:322)
>         at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:121)
>         at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
>         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:283)
>         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:268)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:251)
>         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:264)
>         ... 15 more
>
> Container killed by the ApplicationMaster.
> Container killed on request. Exit code is 143
> Container exited with a non-zero exit code 143
>
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 1  Reduce: 1  HDFS Read: 0  HDFS Write: 0  FAIL
> Total MapReduce CPU Time Spent: 0 msec
>
> Does that help? I took a quick look at ReduceSinkOperator, but was unable
> to put my finger on the issue.
>
> Regards,
>
> Bryan Jeffrey
>
>
> On Mon, Apr 21, 2014 at 10:55 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>
> Hi Bryan
>
> Can you provide more information about the input and output tables? Schema?
> Partitioning and bucketing information? Explain plan of your insert query?
>
> This information will help to diagnose the issue.
>
> Thanks
> Prasanth
>
> Sent from my iPhone
>
> > On Apr 21, 2014, at 7:00 PM, Bryan Jeffrey <bryan.jeff...@gmail.com> wrote:
> >
> > Hello.
> >
> > I am running Hadoop 2.4.0 and Hive 0.13.0. I am encountering the following
> > error when converting a text table to ORC via the following command:
> >
> > Error:
> >
> > Diagnostic Messages for this Task:
> > Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row { - Removed -}
> >         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:195)
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> >         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> >         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> >         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
> > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row { - Removed -}
> >         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
> >         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
> >         ... 8 more
> > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
> >         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:327)
> >         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> >         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
> >         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> >         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
> >         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
> >         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
> >         ... 9 more
> > Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
> >         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> >         at java.util.ArrayList.get(ArrayList.java:322)
> >         at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:121)
> >         at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:109)
> >         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:283)
> >         at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:268)
> >         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:251)
> >         at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:264)
> >         ... 15 more
> >
> > Container killed by the ApplicationMaster.
> > Container killed on request. Exit code is 143
> > Container exited with a non-zero exit code 143
> >
> > There are a number of older issues associated with IndexOutOfBounds errors
> > within the serde, but nothing that appears to specifically match this
> > error. This occurs with all tables (including those consisting
> > exclusively of integers). Any thoughts?
> >
> > Regards,
> >
> > Bryan Jeffrey