If I execute 'hadoop distcp hdfs:///tmp/test1.txt ftp://ftpuser:ftpuser@hostname/tmp/', the exception will be:

attempt_201304222240_0006_m_000000_1: log4j:ERROR Could not connect to remote log4j server at [localhost]. We will try again later.
13/04/23 19:31:33 INFO mapred.JobClient: Task Id : attempt_201304222240_0006_m_000000_2, Status : FAILED
java.io.IOException: Cannot rename parent(source): ftp://ftpuser:ftpuser@hostname/tmp/_distcp_logs_o6gzfy/_temporary/_attempt_201304222240_0006_m_000000_2, parent(destination): ftp://ftpu...@bdvm104.svl.ibm.com/tmp/_distcp_logs_o6gzfy
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:547)
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:512)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
        at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
        at org.apache.hadoop.mapred.Task.commit(Task.java:1019)
        at org.apache.hadoop.mapred.Task.done(Task.java:889)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:373)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
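The stack trace shows the failure happening while FileOutputCommitter commits distcp's log directory: FTPFileSystem.rename appears to reject the move because the source and destination parent directories differ. A minimal probe, assuming commons-net (the FTP client library FTPFileSystem is built on) and the placeholder host and credentials from the command above, to check whether the FTP server itself accepts a cross-directory rename:

    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.commons.net.ftp.FTPReply;

    // Probe: does the FTP server accept a cross-directory RNFR/RNTO?
    // "hostname", "ftpuser", and both paths are placeholders from the thread.
    public class FtpRenameProbe {
      public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("hostname");
        if (!FTPReply.isPositiveCompletion(ftp.getReplyCode())) {
          ftp.disconnect();
          throw new IllegalStateException("FTP server refused connection");
        }
        ftp.login("ftpuser", "ftpuser");
        ftp.enterLocalPassiveMode();
        boolean ok = ftp.rename("/tmp/a/test1.txt", "/tmp/b/test1.txt");
        System.out.println("cross-directory rename accepted: " + ok);
        ftp.logout();
        ftp.disconnect();
      }
    }

If the server accepts the rename here, the restriction is on the Hadoop side rather than in the server.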
2013/4/24 sam liu <samliuhad...@gmail.com>

> Now I can successfully run "hadoop distcp ftp://ftpuser:ftpuser@hostname/tmp/test1.txt hdfs:///tmp/test1.txt".
>
> But "hadoop distcp hdfs:///tmp/test1.txt ftp://ftpuser:ftpuser@hostname/tmp/test1.txt.v1" fails with:
>
> attempt_201304222240_0005_m_000000_1: log4j:ERROR Could not connect to remote log4j server at [localhost]. We will try again later.
> 13/04/23 18:59:05 INFO mapred.JobClient: Task Id : attempt_201304222240_0005_m_000000_2, Status : FAILED
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
>         at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(AccessController.java:310)
>         at javax.security.auth.Subject.doAs(Subject.java:573)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> 2013/4/24 sam liu <samliuhad...@gmail.com>
>
>> I can successfully execute "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname"; it returns the root directory of the Linux system.
>>
>> But "hadoop fs -rm ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here" fails and returns:
>> rm: Delete failed ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here
>>
>> 2013/4/24 Daryn Sharp <da...@yahoo-inc.com>
>>
>>> The ftp fs is listing the contents of the given path's parent directory, and then trying to match the basename of each child path returned against the basename of the given path, which is quite inefficient. The FNF means it didn't find a match for the basename. It may be that the ftp server isn't returning a listing in exactly the expected format, so it's being parsed incorrectly.
>>>
>>> Does "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here" work? Or "hadoop fs -rm ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"? Those cmds should exercise the same code paths where you are experiencing errors.
>>>
>>> Daryn
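A simplified sketch (not the actual Hadoop source) of the lookup Daryn describes, to show why a listing in an unexpected format surfaces as "file not found":

    import java.io.FileNotFoundException;
    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.commons.net.ftp.FTPFile;

    public class BasenameLookup {
      // List the parent directory, then match each child's basename against
      // the requested one. If the server's LIST reply parses into unexpected
      // names, nothing matches and the file "does not exist" even though it
      // is really there.
      static FTPFile lookup(FTPClient client, String parent, String name)
          throws Exception {
        for (FTPFile child : client.listFiles(parent)) {
          if (name.equals(child.getName())) {
            return child;
          }
        }
        throw new FileNotFoundException(parent + "/" + name);
      }
    }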
>>> On Apr 22, 2013, at 9:06 PM, sam liu wrote:
>>>
>>> I encountered IOException and FileNotFoundException:
>>>
>>> 13/04/17 17:11:10 INFO mapred.JobClient: Task Id : attempt_201304160910_2135_m_000000_0, Status : FAILED
>>> java.io.IOException: The temporary job-output directory ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_logs_i74spu/_temporary doesn't exist!
>>>         at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>>>         at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
>>>         at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
>>>         at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:820)
>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>         at java.security.AccessController.doPrivileged(AccessController.java:310)
>>>         at javax.security.auth.Subject.doAs(Subject.java:573)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1144)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>
>>> ... ...
>>>
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Job complete: job_201304160910_2135
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Counters: 6
>>> 13/04/17 17:11:42 INFO mapred.JobClient:   Job Counters
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Failed map tasks=1
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33785
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Launched map tasks=4
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=6436
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201304160910_2135_m_000000
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: java.io.FileNotFoundException: File ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_tmp_i74spu does not exist.
>>>         at org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:419)
>>>         at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:302)
>>>         at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:279)
>>>         at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:963)
>>>         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:672)
>>>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
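Both the "temporary job-output directory ... doesn't exist!" error and the final FileNotFoundException come from FTPFileSystem failing to find paths on the destination. A quick probe, assuming the ftp scheme is wired up as in the commands above and reusing the masked placeholder credentials, is to create a directory through FTPFileSystem and immediately ask for it back:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FtpFsProbe {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(
            URI.create("ftp://hadoopadm:xxxxxxxx@ftphostname/"), conf);
        // Mirror the layout distcp uses; "_distcp_probe" is made up for
        // this test.
        Path probe = new Path("/tmp/_distcp_probe/_temporary");
        System.out.println("mkdirs: " + fs.mkdirs(probe));
        // If this prints false, the server's listing is being parsed
        // incorrectly, which matches Daryn's diagnosis above.
        System.out.println("exists: " + fs.exists(probe));
        fs.delete(new Path("/tmp/_distcp_probe"), true);
      }
    }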
>>> 2013/4/23 Daryn Sharp <da...@yahoo-inc.com>
>>>
>>>> I believe it should work… What error message did you receive?
>>>>
>>>> Daryn
>>>>
>>>> On Apr 22, 2013, at 3:45 AM, sam liu wrote:
>>>>
>>>> > Hi Experts,
>>>> >
>>>> > I failed to execute the following command. Doesn't DistCp support the FTP protocol?
>>>> >
>>>> > hadoop distcp ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/file1.txt hdfs:///tmp/file1.txt
>>>> >
>>>> > Thanks!
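The thread itself offers no workaround, but if the goal is simply to push a single HDFS file to the FTP server, one possible stopgap (an assumption, not something suggested above) is to stream the file with commons-net directly, bypassing FTPFileSystem's commit/rename path entirely:

    import java.io.InputStream;
    import org.apache.commons.net.ftp.FTP;
    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsToFtp {
      public static void main(String[] args) throws Exception {
        // Default FileSystem is assumed to be the HDFS holding test1.txt.
        FileSystem hdfs = FileSystem.get(new Configuration());
        FTPClient ftp = new FTPClient();
        // "hostname" and "ftpuser" are the placeholders used in the thread.
        ftp.connect("hostname");
        ftp.login("ftpuser", "ftpuser");
        ftp.enterLocalPassiveMode();
        ftp.setFileType(FTP.BINARY_FILE_TYPE);
        try (InputStream in = hdfs.open(new Path("/tmp/test1.txt"))) {
          boolean ok = ftp.storeFile("/tmp/test1.txt", in);
          System.out.println("stored: " + ok);
        }
        ftp.logout();
        ftp.disconnect();
      }
    }

This avoids the rename-based commit that failed above, at the cost of losing distcp's parallelism and retry handling.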