Hi all,
We have a process that loads data from a folder of JSON files. We use an external table on top of the JSON folder and then insert into an ORC table. During the load we get the error below: “Column has wrong number of index entries found: 0 expected: 20”. I validated the JSON data and it looks OK. I noticed that I only get this error once the folder reaches a certain number of files (>40); the insert completes when there are fewer than 40 files. I was also able to load the data after concatenating all the files into one big file. We have been using this method for almost a year and have never hit this error before, even when loading folders with thousands of files. (A sketch of the kind of DDL/DML we use is at the end of this message.)

Task ID: task_1439882549481_118182_m_000000
URL: http://hdname:8088/taskdetails.jsp?jobid=job_1439882549481_118182&tipid=task_1439882549481_118182_m_000000

----- Diagnostic Messages for this Task:
Error: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 20
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:726)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1614)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1996)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2288)
    at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:215)
    at org.apache.hadoop.hive.ql.io.merge.MergeFileMapper.close(MergeFileMapper.java:98)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.io.merge.MergeFileTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 2  Cumulative CPU: 55.43 sec  HDFS Read: 266399137  HDFS Write: 17963167  SUCCESS
Stage-Stage-2: Map: 1  HDFS Read: 0  HDFS Write: 0  FAIL
Total MapReduce CPU Time Spent: 55 seconds 430 msec

Regards,
Yehuda

*Yehuda Finkelshtein*
*Senior DBA consultant*, Veracity Ltd.
yeh...@veracity-group.com
www.veracity-group.com
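
P.S. For reference, here is a minimal sketch of the setup described above. The table names, columns, location, and SerDe class are placeholders for illustration only; our real schema differs.

    -- Hypothetical external table over the folder of JSON files.
    CREATE EXTERNAL TABLE staging_json (
      id    STRING,
      value STRING
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION '/data/incoming/json';

    -- Hypothetical target table stored as ORC.
    CREATE TABLE target_orc (
      id    STRING,
      value STRING
    )
    STORED AS ORC;

    -- The load step that fails once the folder holds more than ~40 files.
    INSERT INTO TABLE target_orc
    SELECT id, value FROM staging_json;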