Hadoop mapreduce job to process S3 logs gets hung at INFO mapred.JobClient: map 0% reduce 0%.
----------------------------------------------------------------------------------------------
                 Key: HDFS-2583
                 URL: https://issues.apache.org/jira/browse/HDFS-2583
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Nitika Gupta
            Priority: Blocker

I am trying to run a MapReduce job to process Amazon S3 logs. However, the job hangs at

    INFO mapred.JobClient: map 0% reduce 0%

and does not even attempt to launch the tasks. The sample code for the job setup is given below:

    public int run(CommandLine cl) throws Exception {
        Configuration conf = getConf();
        String inputPath = "";
        String outputPath = "";
        try {
            Job job = new Job(conf, "Dummy");
            job.setNumReduceTasks(0);
            job.setMapperClass(Mapper.class);
            inputPath = cl.getOptionValue("input"); // input is an s3n path
            outputPath = cl.getOptionValue("output");
            FileInputFormat.setInputPaths(job, inputPath);
            FileOutputFormat.setOutputPath(job, new Path(outputPath));
            _log.info("Input path set as " + inputPath);
            _log.info("Output path set as " + outputPath);
            job.waitForCompletion(true);
            return 0;
        } catch (Exception ex) {
            _log.error(ex);
            return 1;
        }
    }

The above code works on the staging machine. However, it fails on the production machine, which is the same as the staging machine but with more capacity.

Does anyone know what could be causing this?

Thanks in advance!
Nitika