I could execute:

- hadoop fs -ls ftp://ftpuser:ftpuser@hostname/tmp/testdir
- hadoop fs -lsr ftp://ftpuser:ftpuser@hostname/tmp/testdir

Is there any special requirement on the FTP configuration for running the distcp tool? In my environment, if I issue 'hadoop fs -lsr ftp://ftpuser:ftpuser@hostname', it returns the root path of my Linux file system.
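[On the configuration question: rather than embedding credentials in every URI, FTPFileSystem can read its connection settings from fs.ftp.* properties. Below is a minimal sketch, assuming the fs.ftp.host, fs.ftp.user.<host>, and fs.ftp.password.<host> keys; the host name, user, masked password, and test path are placeholders taken from this thread.]

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class FtpConfigSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // FTPFileSystem picks up host and credentials from fs.ftp.* keys,
        // so they need not appear in the URI. All values below are
        // placeholders from this thread, not real settings.
        conf.set("fs.ftp.host", "ftphostname");
        conf.set("fs.ftp.user.ftphostname", "hadoopadm");
        conf.set("fs.ftp.password.ftphostname", "xxxxxxxx");

        // List a directory over FTP, equivalent to 'hadoop fs -ls ftp://...'.
        FileSystem ftpFs = FileSystem.get(URI.create("ftp://ftphostname/"), conf);
        for (FileStatus stat : ftpFs.listStatus(new Path("/tmp/testdir"))) {
          System.out.println(stat.getPath());
        }
      }
    }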
2013/4/24 Daryn Sharp <da...@yahoo-inc.com>

> Listing the root is a bit of a special case that is different than N-many
> directories deep. Can you list
> ftp://hadoopadm:xxxxxxxx@ftphostname/some/dir/file or
> ftp://hadoopadm:xxxxxxxx@ftphostname/some/dir? I suspect the ftp fs has a
> bug, so they will fail too.
>
> On Apr 23, 2013, at 8:03 PM, sam liu wrote:
>
> I can successfully execute "hadoop fs -ls
> ftp://hadoopadm:xxxxxxxx@ftphostname"; it returns the root path of the
> Linux system.
>
> But I failed to execute "hadoop fs -rm
> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here", and it returns:
>
> rm: Delete failed ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here
>
> 2013/4/24 Daryn Sharp <da...@yahoo-inc.com>
>
>> The ftp fs is listing the contents of the given path's parent directory,
>> and then trying to match the basename of each child path returned against
>> the basename of the given path – quite inefficient… [see the sketch at the
>> end of this thread]. The FNF is that it didn't find a match for the
>> basename. It may be that the ftp server isn't returning a listing in
>> exactly the expected format, so it's being parsed incorrectly.
>>
>> Does "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"
>> work? Or "hadoop fs -rm
>> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"? Those cmds should
>> exercise the same code paths where you are experiencing errors.
>>
>> Daryn
>>
>> On Apr 22, 2013, at 9:06 PM, sam liu wrote:
>>
>> I encountered IOException and FileNotFoundException:
>>
>> 13/04/17 17:11:10 INFO mapred.JobClient: Task Id :
>> attempt_201304160910_2135_m_000000_0, Status : FAILED
>> java.io.IOException: The temporary job-output directory
>> ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_logs_i74spu/_temporary
>> doesn't exist!
>>     at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>>     at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
>>     at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
>>     at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:820)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>     at java.security.AccessController.doPrivileged(AccessController.java:310)
>>     at javax.security.auth.Subject.doAs(Subject.java:573)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1144)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>
>> ... ...
>>
>> 13/04/17 17:11:42 INFO mapred.JobClient: Job complete: job_201304160910_2135
>> 13/04/17 17:11:42 INFO mapred.JobClient: Counters: 6
>> 13/04/17 17:11:42 INFO mapred.JobClient:   Job Counters
>> 13/04/17 17:11:42 INFO mapred.JobClient:     Failed map tasks=1
>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33785
>> 13/04/17 17:11:42 INFO mapred.JobClient:     Launched map tasks=4
>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=6436
>> 13/04/17 17:11:42 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201304160910_2135_m_000000
>> With failures, global counters are inaccurate; consider running with -i
>> Copy failed: java.io.FileNotFoundException: File
>> ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_tmp_i74spu does not exist.
>>     at org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:419)
>>     at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:302)
>>     at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:279)
>>     at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:963)
>>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:672)
>>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>
>> 2013/4/23 sam liu <samliuhad...@gmail.com>
>>
>>> 2013/4/23 Daryn Sharp <da...@yahoo-inc.com>
>>>
>>>> I believe it should work… What error message did you receive?
>>>>
>>>> Daryn
>>>>
>>>> On Apr 22, 2013, at 3:45 AM, sam liu wrote:
>>>>
>>>> > Hi Experts,
>>>> >
>>>> > I failed to execute the following command. Does distcp not support
>>>> > the FTP protocol?
>>>> >
>>>> > hadoop distcp ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/file1.txt
>>>> > hdfs:///tmp/file1.txt
>>>> >
>>>> > Thanks!
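[For reference, here is a rough, hypothetical sketch of the lookup strategy Daryn describes above: resolving a path's status by listing its parent directory and scanning the children for a matching basename. It is modeled on the behavior attributed to org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus, not its actual source; the FileNotFoundException message mirrors the trace in this thread. If the server's LIST output is not parsed the way the client expects, no child name matches and the lookup fails even though the file exists, which would explain both the failed -rm and the distcp FNF.]

    import java.io.FileNotFoundException;
    import java.io.IOException;

    import org.apache.commons.net.ftp.FTPClient;
    import org.apache.commons.net.ftp.FTPFile;

    public class BasenameLookupSketch {
      // Resolve a path by listing its PARENT directory and matching on the
      // basename -- inefficient, and fragile if the listing parses badly.
      static FTPFile getFileStatus(FTPClient client, String path)
          throws IOException {
        int slash = path.lastIndexOf('/');
        String parent = (slash <= 0) ? "/" : path.substring(0, slash);
        String basename = path.substring(slash + 1);
        // List everything in the parent, then look for the requested child.
        for (FTPFile child : client.listFiles(parent)) {
          if (basename.equals(child.getName())) {
            return child;
          }
        }
        // No basename matched: the "does not exist" failure seen above,
        // even when the file is actually present on the server.
        throw new FileNotFoundException("File " + path + " does not exist.");
      }
    }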