Hi all,


We have a process that loads data from a folder of JSON files.

We define an external table on top of the JSON folder and then insert from it
into an ORC table.



We get the error below during the load:

“Column has wrong number of index entries found: 0 expected: 20”





I validated the JSON data and it looks OK.

I noticed that I get this error only once the folder reaches a certain number of files (more than 40).

I was able to complete the insert when the number of files was less than 40.

I was also able to load the data after concatenating all the files into one big file.
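
For anyone who wants to reproduce the concatenation workaround, here is a minimal Python sketch. It assumes each file holds one JSON record per line and that the files sit on a local filesystem; the function name, the "*.json" pattern and the paths are hypothetical and only for illustration, not the exact script that was used.

```python
from pathlib import Path


def concatenate_json_files(src_dir, dest_file, pattern="*.json"):
    """Append every file in src_dir matching pattern to dest_file.

    Returns the number of source files that were concatenated.
    """
    dest = Path(dest_file).resolve()
    # Sort for a stable order and never read the output file as an input.
    sources = sorted(p for p in Path(src_dir).glob(pattern) if p.resolve() != dest)
    with open(dest, "wb") as out:
        for path in sources:
            data = path.read_bytes()
            out.write(data)
            # Make sure the last record of each file ends with a newline,
            # so it is not glued to the first record of the next file.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
    return len(sources)
```

The single output file can then be placed in the folder that the external table points to, in place of the many small files.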



We have been using this method for almost a year and never got this error before,

even when loading folders with thousands of files.





Task ID:

  task_1439882549481_118182_m_000000



URL:


http://hdname:8088/taskdetails.jsp?jobid=job_1439882549481_118182&tipid=task_1439882549481_118182_m_000000

-----

Diagnostic Messages for this Task:

Error: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 20
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:726)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1614)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1996)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2288)
        at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:215)
        at org.apache.hadoop.hive.ql.io.merge.MergeFileMapper.close(MergeFileMapper.java:98)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)





FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.io.merge.MergeFileTask

MapReduce Jobs Launched:

Stage-Stage-1: Map: 2   Cumulative CPU: 55.43 sec   HDFS Read: 266399137 HDFS Write: 17963167 SUCCESS

Stage-Stage-2: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL

Total MapReduce CPU Time Spent: 55 seconds 430 msec









Regards,

Yehuda





*Yehuda Finkelshtein*

*Senior DBA consultant*,

Veracity Ltd.

yeh...@veracity-group.com

www.veracity-group.com
