Hello, I am using Hive 0.13.1 on EMR and trying to create a Hive table on top of our custom file system (a thin wrapper over S3), and I am getting an error while accessing the data in the table. The command history and stack trace are below.
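For context, the cda:// scheme is wired up in core-site.xml in the standard Hadoop fs.<scheme>.impl way, roughly like the entry below (the class name here is a stand-in for our actual implementation):

<property>
  <name>fs.cda.impl</name>
  <!-- stand-in name for our custom FileSystem class -->
  <value>logprocessing.fs.CdaFileSystem</value>
</property>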
My first suspicion was that CombineFileInputFormat was accessing the splits incorrectly, but HiveInputFormat causes the same failure. Has anyone seen this problem before? Note that both the SerDe and the FileSystem are custom; could either of those be causing it?

hive> add jar /home/hadoop/logprocessing-pig-combined.jar;
Added /home/hadoop/logprocessing-pig-combined.jar to class path
Added resource: /home/hadoop/logprocessing-pig-combined.jar
hive> Create external table nulf
    > (
    > tm STRING
    > )
    > ROW FORMAT SERDE 'logprocessing.nulf.basic.BasicHiveSerDe'
    > location 'cda://path/to/logs/';
OK
Time taken: 6.706 seconds
hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
hive> select count(*) from nulf;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.hadoop.mapred.FileInputFormat.getSplitHosts(FileInputFormat.java:529)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:320)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:290)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:371)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1503)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1270)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1088)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:275)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:430)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:803)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:636)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.lang.ArrayIndexOutOfBoundsException(1)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

-- Saumitra S. Shahapure
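P.S. From reading the Hadoop 2.x source, FileInputFormat.getSplitHosts appears to walk the BlockLocation array returned by FileSystem.getFileBlockLocations until it has accounted for splitSize bytes, so a file system that reports block locations whose lengths do not add up to the file length could make it index past the end of the array, which would match the ArrayIndexOutOfBoundsException: 1 above. Below is a minimal sketch of the getFileBlockLocations contract I believe we need to satisfy; the body mirrors the single-block default in Hadoop's FileSystem base class. The class name is the stand-in from above, and extending NativeS3FileSystem is only an assumption to keep the sketch compilable, not necessarily how our wrapper is built:

import java.io.IOException;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.s3native.NativeS3FileSystem;

// Stand-in for our actual cda:// FileSystem; extending NativeS3FileSystem
// here is only so the sketch compiles on its own.
public class CdaFileSystem extends NativeS3FileSystem {

  // Report a single synthetic block spanning the whole file, as the
  // FileSystem base class does by default. The key invariant is that the
  // returned blocks cover every byte of [0, file.getLen()); otherwise
  // FileInputFormat.getSplitHosts can run off the end of the array.
  @Override
  public BlockLocation[] getFileBlockLocations(FileStatus file,
      long start, long len) throws IOException {
    if (file == null) {
      return null;
    }
    if (start < 0 || len < 0) {
      throw new IllegalArgumentException("Invalid start or len parameter");
    }
    if (file.getLen() <= start) {
      return new BlockLocation[0];
    }
    String[] name = { "localhost:50010" }; // dummy datanode name
    String[] host = { "localhost" };       // dummy host
    return new BlockLocation[] {
        new BlockLocation(name, host, 0, file.getLen()) };
  }
}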