Thanks for your answer, I tested the program with an S3N setup and unfortunately got the same error behavior...
From: Dean Wampler [mailto:deanwamp...@gmail.com]
Sent: Thursday, April 18, 2013 16:25
To: user@hive.apache.org
Subject: Re: Hive query problem on S3 table

I'm not sure what's happening here, but one suggestion: use s3n://... instead of s3://... The "new" version is supposed to provide better performance.

dean

On Thu, Apr 18, 2013 at 8:43 AM, Tim Bittersohl <t...@innoplexia.com> wrote:

Hi,

I just found out that I don't have to change the default file system of Hadoop; only the location in the CREATE TABLE command has to be changed:

CREATE EXTERNAL TABLE testtable(nyseVal STRING, cliVal STRING, dateVal STRING, number1Val STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS TextFile
LOCATION "s3://hadoop-bucket/data/"

But when I try to access the table with a command that creates a Hadoop job, I get the following error:

13/04/18 15:29:36 ERROR security.UserGroupInformation: PriviledgedActionException as:tim (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /data/NYSE_daily.txt
java.io.FileNotFoundException: File does not exist: /data/NYSE_daily.txt
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:807)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:462)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920)
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1138)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /data/NYSE_daily.txt)'
13/04/18 15:29:36 ERROR exec.Task: Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /data/NYSE_daily.txt)'
java.io.FileNotFoundException: File does not exist: /data/NYSE_daily.txt
    (stack trace identical to the one above)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/04/18 15:29:36 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

On the internet I found the hint to set the following configuration to solve the problem:

hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat

But I just get a RuntimeException when doing so:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.io.HiveInputFormat
    at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:333)
    at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1138)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
    at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
    at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
13/04/18 15:37:14 ERROR exec.ExecDriver: Exception: org.apache.hadoop.hive.ql.io.HiveInputFormat
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/04/18 15:37:14 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

I'm using the Cloudera 0.10.0-cdh4.2.0 version of the Hive libraries.

Greetings

Tim Bittersohl
Software Engineer
Innoplexia GmbH, Mannheimer Str. 175, 69123 Heidelberg
Web: www.innoplexia.com

--
Dean Wampler, Ph.D.
@deanwampler
http://polyglotprogramming.com
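Putting the reply's suggestion together with the table definition from the original message, the s3n:// variant of the setup would look roughly like the sketch below. The bucket name, column list, and the input-format property are all taken from the thread; whether this actually avoids the FileNotFoundException depends on the CombineHiveInputFormat behavior discussed above, so treat it as a sketch rather than a confirmed fix.

```sql
-- Sketch only: the same external table as in the thread, but using the
-- s3n:// scheme in LOCATION as suggested in the reply.
CREATE EXTERNAL TABLE testtable (
  nyseVal    STRING,
  cliVal     STRING,
  dateVal    STRING,
  number1Val STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 's3n://hadoop-bucket/data/';

-- The per-session input-format override mentioned in the thread,
-- intended to bypass CombineHiveInputFormat's split computation:
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
```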