Hello, I am using Hive 0.13.1 on EMR and trying to create a Hive table on top of our custom file system (a thin wrapper over S3), and I am getting an error while accessing the data in the table. The command history and stack trace are below.
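For context, the cda:// scheme is wired up in core-site.xml in the standard Hadoop fs.<scheme>.impl way, roughly like the entry below (the class name here is a stand-in for our actual implementation):

<property>
  <name>fs.cda.impl</name>
  <!-- stand-in name for our custom FileSystem class -->
  <value>logprocessing.fs.CdaFileSystem</value>
</property>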
My first suspicion was that CombineFileInputFormat was accessing the splits incorrectly, but HiveInputFormat causes the same failure. Has anyone seen this problem before? Note that both the SerDe and the FileSystem are custom; could either of those be causing it?

hive> add jar /home/hadoop/logprocessing-pig-combined.jar;
Added /home/hadoop/logprocessing-pig-combined.jar to class path
Added resource: /home/hadoop/logprocessing-pig-combined.jar
hive> Create external table nulf
    > (
    > tm STRING
    > )
    > ROW FORMAT SERDE 'logprocessing.nulf.basic.BasicHiveSerDe'
    > location 'cda://path/to/logs/';
OK
Time taken: 6.706 seconds
hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
hive> select count(*) from nulf;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.mapred.FileInputFormat.getSplitHosts(FileInputFormat.java:529)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:320)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:290)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:371)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:803)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.lang.ArrayIndexOutOfBoundsException(1)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

-- Saumitra S. Shahapure
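P.S. From reading the Hadoop 2.x source, FileInputFormat.getSplitHosts appears to walk the BlockLocation array returned by FileSystem.getFileBlockLocations until it has accounted for splitSize bytes, so a file system that reports block locations whose lengths do not add up to the file length could make it index past the end of the array, which would match the ArrayIndexOutOfBoundsException: 1 above. Below is a minimal sketch of the getFileBlockLocations contract I believe we need to satisfy; the body mirrors the single-block default in Hadoop's FileSystem base class. The class name is the stand-in from above, and extending NativeS3FileSystem is only an assumption to keep the sketch compilable, not necessarily how our wrapper is built:

import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.s3native.NativeS3FileSystem;

// Stand-in for our actual cda:// FileSystem; extending NativeS3FileSystem
// here is only so the sketch compiles on its own.
public class CdaFileSystem extends NativeS3FileSystem {

  // Report a single synthetic block spanning the whole file, as the
  // FileSystem base class does by default. The key invariant is that the
  // returned blocks cover every byte of [0, file.getLen()); otherwise
  // FileInputFormat.getSplitHosts can run off the end of the array.
  @Override
  public BlockLocation[] getFileBlockLocations(FileStatus file,
      long start, long len) throws IOException {
    if (file == null) {
      return null;
    }
    if (start < 0 || len < 0) {
      throw new IllegalArgumentException("Invalid start or len parameter");
    }
    if (file.getLen() <= start) {
      return new BlockLocation[0];
    }
    String[] name = { "localhost:50010" }; // dummy datanode name
    String[] host = { "localhost" };       // dummy host
    return new BlockLocation[] {
        new BlockLocation(name, host, 0, file.getLen()) };
  }
}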