It may be. The problem with text file formats is that if the file has any special characters inside and they are not properly escaped, you end up in trouble.
In my case we always process the files before loading into a Hive table (especially for text data, because a simple newline in a CSV file without a proper CSV reader will break the code).

On Fri, Jul 31, 2015 at 12:53 PM, ravi teja <raviort...@gmail.com> wrote:

> OK, I will try it out.
>
> I see this INFO log in the MR log; should this be a problem?
>
> 2015-07-31 11:27:47,487 INFO [main] org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct: Missing fields! Expected 14 fields but only got 7! Last field end 97 and serialize buffer end 61. Ignoring similar problems.
>
> On Fri, Jul 31, 2015 at 12:47 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>
>> Is there a different output format, or is the output table bucketed?
>> Can you try putting a NOT NULL condition on the join columns?
>>
>> On Fri, Jul 31, 2015 at 12:45 PM, ravi teja <raviort...@gmail.com> wrote:
>>
>>> Hi Nitin,
>>> Thanks for replying.
>>> The select query runs like a charm; the problem occurs only when inserting into a table.
>>>
>>> Please find the answers inline.
>>>
>>> Thanks,
>>> Ravi
>>>
>>> On Fri, Jul 31, 2015 at 12:34 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>
>>>> Sorry, but I could not find the following info:
>>>> 1) Are you using Tez as the execution engine? If yes, make sure it is not a snapshot version. *NO*
>>>> 2) Are you using the ORC file format? If yes, then set the flag to ignore corrupt data. *NO, it's the text file format*
>>>> 3) Are there nulls in your join condition columns? *Yes, there might be some*
>>>> If possible, share the query and the underlying file formats with some sample data. *I can't really share the query.*
>>>>
>>>> On Fri, Jul 31, 2015 at 12:14 PM, ravi teja <raviort...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We are facing an issue with our Hive query: an ArrayIndexOutOfBoundsException.
>>>>> I have tried googling it, and I see many users facing the same error, but no solution yet.
>>>>> This is a blocker for our production and we really need help on this.
>>>>>
>>>>> We are using Hive version: 1.3.0.
>>>>>
>>>>> Our query is doing multiple joins (right and left).
>>>>>
>>>>> *Diagnostic Messages for this Task:*
>>>>> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":48436215,"_col1":87269315,"_col2":"\u0000","_col3":"Customer","_col4":null,"_col5":null,"_col6":"CSS Email","_col7":"","_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null}
>>>>>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
>>>>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>>>>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>>>>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
>>>>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":48436215,"_col1":87269315,"_col2":"\u0000","_col3":"Customer","_col4":null,"_col5":null,"_col6":"CSS Email","_col7":"","_col8":null,"_col9":null,"_col10":null,"_col11":null,"_col12":null,"_col13":null}
>>>>>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
>>>>>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
>>>>>     ... 8 more
>>>>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
>>>>>     at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:403)
>>>>>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>>>>>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
>>>>>     at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
>>>>>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508)
>>>>>     ... 9 more
>>>>> Caused by: java.lang.ArrayIndexOutOfBoundsException
>>>>>     at java.lang.System.arraycopy(Native Method)
>>>>>     at org.apache.hadoop.io.Text.set(Text.java:225)
>>>>>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
>>>>>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:267)
>>>>>     at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:204)
>>>>>     at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
>>>>>     at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
>>>>>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>>>>>     at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>>>>>     at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:558)
>>>>>     at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:383)
>>>>>     ... 13 more
>>>>>
>>>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>
>>>>> Thanks,
>>>>> Ravi
>>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>
>>
>> --
>> Nitin Pawar
>

--
Nitin Pawar
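[Editor's note: a short illustration of the "newline in a CSV file without a proper CSV reader" point from the top of the thread. A quoted CSV field may legally contain a newline, so it spans two physical lines; any line-oriented reader then sees one logical row as two broken fragments. This is a minimal Python sketch with made-up data, not taken from the thread:]

```python
import csv
import io

# Two-column CSV; the second field of row 1 contains an embedded
# newline inside a quoted field (legal CSV per RFC 4180).
raw = 'id,comment\n1,"hello\nworld"\n2,plain\n'

# Naive line-oriented parsing: splitting on physical newlines breaks
# the quoted row into two fragments, giving 4 "rows" instead of 3.
naive_rows = [line.split(",") for line in raw.strip().split("\n")]
print(len(naive_rows))  # 4: header + 3 fragments

# A proper CSV reader respects the quoting and keeps the row intact.
proper_rows = list(csv.reader(io.StringIO(raw)))
print(len(proper_rows))  # 3: header + 2 data rows
print(proper_rows[1])    # ['1', 'hello\nworld']
```

This is why the thread recommends preprocessing text files (or using a CSV-aware reader) before loading them into a Hive text table: Hive's default text handling splits records on newlines, just like the naive parse above.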