Hadoop mapreduce job to process S3 logs gets hung at INFO mapred.JobClient:  
map 0% reduce 0%.
----------------------------------------------------------------------------------------------

                 Key: HDFS-2583
                 URL: https://issues.apache.org/jira/browse/HDFS-2583
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Nitika Gupta
            Priority: Blocker


I am trying to run a MapReduce job to process Amazon S3 logs. However, the 
job hangs at INFO mapred.JobClient:  map 0% reduce 0% and never even 
attempts to launch the tasks. The job setup code is given below:
public int run(CommandLine cl) throws Exception 
{
       Configuration conf = getConf();
       String inputPath = "";
       String outputPath = "";
       try
       {
           Job job = new Job(conf, "Dummy");
           job.setNumReduceTasks(0);          // map-only job
           job.setMapperClass(Mapper.class);  // base Mapper passes records through unchanged
           inputPath = cl.getOptionValue("input"); // input is an s3n path
           outputPath = cl.getOptionValue("output");
           FileInputFormat.setInputPaths(job, inputPath);
           FileOutputFormat.setOutputPath(job, new Path(outputPath));
           _log.info("Input path set as " + inputPath);
           _log.info("Output path set as " + outputPath);
           // Propagate job failure through the return code instead of always returning 0
           return job.waitForCompletion(true) ? 0 : 1;
       } catch (Exception ex)
       {
           _log.error("Job setup or execution failed", ex);
           return 1;
       }
}
The above code works on the staging machine. However, it hangs as described 
on the production machine, which is the same as the staging machine except 
that it has more capacity.
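
Since the two machines are supposed to be the same, one way to narrow this
down might be to compare their Hadoop configuration files property by
property. The following is only a rough sketch of such a comparison and is
not part of the job itself: the class name ConfDiff and the file paths passed
on the command line are hypothetical, and it assumes the usual
<property><name>/<value> layout of the Hadoop *-site.xml files. It uses only
the JDK's XML parser, so it does not need Hadoop on the classpath.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: compares two Hadoop-style XML configuration files.
final class ConfDiff {

    // Reads every <property> element into a sorted name -> value map.
    static Map<String, String> load(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(in);
        NodeList props = doc.getElementsByTagName("property");
        Map<String, String> out = new TreeMap<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String name = text(p, "name");
            if (name != null) {
                out.put(name, text(p, "value"));
            }
        }
        return out;
    }

    // Returns the trimmed text of the first child element with this tag, or null.
    static String text(Element parent, String tag) {
        NodeList n = parent.getElementsByTagName(tag);
        return n.getLength() == 0 ? null : n.item(0).getTextContent().trim();
    }

    // Returns name -> {valueInA, valueInB} for every property that differs;
    // a null entry means the property is missing on that side.
    static Map<String, String[]> diff(Map<String, String> a, Map<String, String> b) {
        Set<String> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        Map<String, String[]> out = new TreeMap<>();
        for (String k : keys) {
            String va = a.get(k);
            String vb = b.get(k);
            if (!Objects.equals(va, vb)) {
                out.put(k, new String[] {va, vb});
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ConfDiff <staging.xml> <production.xml>");
            return;
        }
        try (InputStream s = Files.newInputStream(Paths.get(args[0]));
             InputStream p = Files.newInputStream(Paths.get(args[1]))) {
            for (Map.Entry<String, String[]> e : diff(load(s), load(p)).entrySet()) {
                System.out.println(e.getKey() + ": staging=" + e.getValue()[0]
                        + " production=" + e.getValue()[1]);
            }
        }
    }
}
```

It would be run once per configuration file pair, for example with the
mapred-site.xml copied from each machine. Any property that differs or is
missing on one side is printed, which might point at a setting that keeps
tasks from being launched on production.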

Does anyone know what could be causing the hang?

Thanks in advance!

Nitika

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
