Re: Why does DistCp fail over the FTP protocol?
I encountered an IOException and a FileNotFoundException:

13/04/17 17:11:10 INFO mapred.JobClient: Task Id : attempt_201304160910_2135_m_00_0, Status : FAILED
java.io.IOException: The temporary job-output directory ftp://hadoopadm:@ftphostname/tmp/_distcp_logs_i74spu/_temporary doesn't exist!
        at org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
        at org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
        at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
        at org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.(MapTask.java:820)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1144)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

... ...

13/04/17 17:11:42 INFO mapred.JobClient: Job complete: job_201304160910_2135
13/04/17 17:11:42 INFO mapred.JobClient: Counters: 6
13/04/17 17:11:42 INFO mapred.JobClient:   Job Counters
13/04/17 17:11:42 INFO mapred.JobClient:     Failed map tasks=1
13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33785
13/04/17 17:11:42 INFO mapred.JobClient:     Launched map tasks=4
13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=6436
13/04/17 17:11:42 INFO mapred.JobClient: Job Failed: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201304160910_2135_m_00
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.io.FileNotFoundException: File ftp://hadoopadm:@ftphostname/tmp/_distcp_tmp_i74spu does not exist.
        at org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:419)
        at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:302)
        at org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:279)
        at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:963)
        at org.apache.hadoop.tools.DistCp.copy(DistCp.java:672)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)

2013/4/23 sam liu
> ...
For release 2.0.X, when will there be a stable release?
Hi,

The current release on the 2.0.X line is 2.0.3-alpha. When will there be a stable release?

Thanks!

Sam Liu
Encountering 'error: possibly undefined macro: AC_PROG_LIBTOOL' when building Hadoop on SUSE 11 (x86_64)
Hi Experts,

I failed to build the Hadoop 1.1.1 source code project on SUSE 11 (x86_64), and encountered an issue:

    [exec] configure.ac:48: error: possibly undefined macro: AC_PROG_LIBTOOL
    [exec]       If this token and others are legitimate, please use m4_pattern_allow.
    [exec]       See the Autoconf documentation.
    [exec] autoreconf: /usr/local/bin/autoconf failed with exit status: 1

Even after installing libtool.x86_64 2.2.6b-13.16.1, the issue still exists.

Does anyone know about this issue?

Thanks!

Sam Liu
Re: Encountering 'error: possibly undefined macro: AC_PROG_LIBTOOL' when building Hadoop on SUSE 11 (x86_64)
autoconf.noarch 2.68-4.1

2013/4/23 Harsh J
> What version of autoconf are you using?
>
> On Tue, Apr 23, 2013 at 12:18 PM, sam liu wrote:
> > ...
>
> --
> Harsh J
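The AC_PROG_LIBTOOL error usually means that the aclocal being run cannot see libtool's m4 macros, which is easy to hit when autoconf/automake live under /usr/local (as the '/usr/local/bin/autoconf' path in the error suggests) while the SUSE libtool package installs its macros under /usr/share/aclocal. A minimal check-and-retry sketch, assuming those conventional paths; adjust to the actual layout on the machine:

    # Where do the tools come from, and where is libtool.m4 actually installed?
    which autoconf automake aclocal libtool
    ls /usr/share/aclocal/libtool.m4 /usr/local/share/aclocal/ 2>/dev/null

    # If the hand-built aclocal only searches /usr/local/share/aclocal, point it
    # at the system macro directory (or symlink libtool.m4 there) and regenerate:
    ACLOCAL="aclocal -I /usr/share/aclocal" autoreconf -i -f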
Re: Why does DistCp fail over the FTP protocol?
I can successfully execute "hadoop fs -ls ftp://hadoopadm:@ftphostname"; it returns the root path of the Linux system.

But "hadoop fs -rm ftp://hadoopadm:@ftphostname/some/path/here" fails, and it returns:
rm: Delete failed ftp://hadoopadm:@ftphostname/some/path/here

2013/4/24 Daryn Sharp
> The ftp fs is listing the contents of the given path's parent directory,
> and then trying to match the basename of each child path returned against
> the basename of the given path – quite inefficient…  The FNF is it didn't
> find a match for the basename.  It may be that the ftp server isn't
> returning a listing in exactly the expected format so it's being parsed
> incorrectly.
>
> Does "hadoop fs -ls ftp://hadoopadm:@ftphostname/some/path/here" work?
> Or "hadoop fs -rm ftp://hadoopadm:@ftphostname/some/path/here"?  Those
> cmds should exercise the same code paths where you are experiencing errors.
>
> Daryn
>
> On Apr 22, 2013, at 9:06 PM, sam liu wrote:
> > ...
Re: Why does DistCp fail over the FTP protocol?
Now I can successfully run:
    hadoop distcp ftp://ftpuser:ftpuser@hostname/tmp/test1.txt hdfs:///tmp/test1.txt

But it fails on:
    hadoop distcp hdfs:///tmp/test1.txt ftp://ftpuser:ftpuser@hostname/tmp/test1.txt.v1

and returns an issue like:

attempt_20130440_0005_m_00_1: log4j:ERROR Could not connect to remote log4j server at [localhost]. We will try again later.
13/04/23 18:59:05 INFO mapred.JobClient: Task Id : attempt_20130440_0005_m_00_2, Status : FAILED
java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
        at org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

2013/4/24 sam liu
> ...
Re: Why does DistCp fail over the FTP protocol?
If I execute 'hadoop distcp hdfs:///tmp/test1.txt ftp://ftpuser:ftpuser@hostname/tmp/', the exception will be:

attempt_20130440_0006_m_00_1: log4j:ERROR Could not connect to remote log4j server at [localhost]. We will try again later.
13/04/23 19:31:33 INFO mapred.JobClient: Task Id : attempt_20130440_0006_m_00_2, Status : FAILED
java.io.IOException: Cannot rename parent(source): ftp://ftpuser:ftpuser@hostname/tmp/_distcp_logs_o6gzfy/_temporary/_attempt_20130440_0006_m_00_2, parent(destination): ftp://ftpuser:ftpu...@bdvm104.svl.ibm.com/tmp/_distcp_logs_o6gzfy
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:547)
        at org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:512)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
        at org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
        at org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
        at org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
        at org.apache.hadoop.mapred.Task.commit(Task.java:1019)
        at org.apache.hadoop.mapred.Task.done(Task.java:889)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:373)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

2013/4/24 sam liu
> ...
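The failing call above is FTPFileSystem.rename during task commit, and the message shows the source parent (.../_distcp_logs_o6gzfy/_temporary/_attempt_...) differing from the destination parent (.../_distcp_logs_o6gzfy). The Hadoop FTP filesystem appears to support renames only within a single directory, while FileOutputCommitter has to move task output up the directory tree. A small reproduction sketch against the same FTP destination, using hypothetical file names and the placeholder user/host from this thread:

    # A rename that stays within one directory typically succeeds:
    hadoop fs -mv ftp://ftpuser:ftpuser@hostname/tmp/a.txt ftp://ftpuser:ftpuser@hostname/tmp/b.txt

    # A rename whose source and destination parents differ should fail with the
    # same "Cannot rename parent(source) ..." message, which is the kind of
    # cross-directory rename DistCp's commit step needs when the destination is ftp://
    hadoop fs -mv ftp://ftpuser:ftpuser@hostname/tmp/sub/a.txt ftp://ftpuser:ftpuser@hostname/tmp/a.txt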
Re: Why does DistCp fail over the FTP protocol?
I could execute:
- hadoop fs -ls ftp://ftpuser:ftpuser@hostname/tmp/testdir
- hadoop fs -lsr ftp://ftpuser:ftpuser@hostname/tmp/testdir

Is there any special requirement on the FTP configuration for running the distcp tool? In my env, if I issue 'hadoop fs -lsr ftp://ftpuser:ftpuser@hostname', it returns the root path of my Linux file system.

2013/4/24 Daryn Sharp
> Listing the root is a bit of a special case that is different than N-many
> directories deep.  Can you list ftp://hadoopadm:@ftphostname/some/dir/file
> or ftp://hadoopadm:@ftphostname/some/dir?  I suspect ftp fs has a bug, so
> they will fail too.
>
> On Apr 23, 2013, at 8:03 PM, sam liu wrote:
> > ...
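On the question about special FTP configuration: instead of embedding credentials in every URI, the FTP filesystem can also read them from configuration. A sketch, assuming the fs.ftp.* property names used by FTPFileSystem and the placeholder host/user from this thread ('secret' is a stand-in password):

    # Per command:
    hadoop fs -D fs.ftp.host=ftphostname \
              -D fs.ftp.user.ftphostname=hadoopadm \
              -D fs.ftp.password.ftphostname=secret \
              -ls ftp://ftphostname/some/path/here

    # Or set the same properties in core-site.xml so that distcp map tasks
    # inherit them:
    #   fs.ftp.host                  = ftphostname
    #   fs.ftp.user.ftphostname      = hadoopadm
    #   fs.ftp.password.ftphostname  = secret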
How to build the hadoop-2.0.3-alpha-src project to get a layout like hadoop-2.0.3-alpha?
Hi,

I got hadoop-2.0.3-alpha-src.tar.gz and hadoop-2.0.3-alpha.tar.gz, but found they have different structures, as below:

- hadoop-2.0.3-alpha contains the folders/files:
  bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share

- hadoop-2.0.3-alpha-src.tar.gz contains the folders/files:
  BUILDING.txt  dev-support  hadoop-assemblies  hadoop-client  hadoop-common-project  hadoop-dist
  hadoop-hdfs-project  hadoop-mapreduce-project  hadoop-minicluster  hadoop-project  hadoop-project-dist
  hadoop-tools  hadoop-yarn-project  LICENSE.txt  NOTICE.txt  pom.xml  README.txt
  releasenotes.HADOOP.2.0.3-alpha.html  releasenotes.HDFS.2.0.3-alpha.html
  releasenotes.MAPREDUCE.2.0.3-alpha.html  releasenotes.YARN.2.0.3-alpha.html

In hadoop-2.0.3-alpha-src, I successfully ran 'mvn package -Pdist -DskipTests -Dtar', but I do not know how to get a build output with a folder/file structure similar to the downloaded 'hadoop-2.0.3-alpha' package.

Any suggestions? Thanks!

Sam Liu
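For the build question: with the 'dist' profile, the binary layout is assembled under the hadoop-dist module rather than at the top level of the source tree. A sketch of where to look, assuming a stock 2.0.3-alpha source tree as described in BUILDING.txt (exact paths can differ between versions):

    cd hadoop-2.0.3-alpha-src
    mvn package -Pdist -DskipTests -Dtar

    # The bin/etc/lib/libexec/sbin/share layout is typically assembled here:
    ls hadoop-dist/target/hadoop-2.0.3-alpha/

    # And -Dtar also produces a matching tarball:
    ls hadoop-dist/target/hadoop-2.0.3-alpha.tar.gz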
Failed to install openssl-devel 1.0.0-20.el6 on RHEL Server 6.3 (x86_64)
Hi,

For building Hadoop on RHEL Server 6.3 (x86_64), I tried to install openssl-devel, but failed with the error below. The required version of glibc-common is 2.12-1.47.el6, but the installed one is 2.12-1.80.el6, which is newer. Why does it fail, and how can this issue be resolved?

---> Package nss-softokn-freebl.i686 0:3.12.9-11.el6 will be installed
--> Finished Dependency Resolution
Error: Package: glibc-2.12-1.47.el6.i686 (rhel-cd)
           Requires: glibc-common = 2.12-1.47.el6
           Installed: glibc-common-2.12-1.80.el6.x86_64 (@anaconda-RedHatEnterpriseLinux-201206132210.x86_64/6.3)
               glibc-common = 2.12-1.80.el6
           Available: glibc-common-2.12-1.47.el6.x86_64 (rhel-cd)
               glibc-common = 2.12-1.47.el6
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

Thanks!

Sam Liu
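One reading of the error, offered as a guess rather than a confirmed diagnosis: yum is pulling the 32-bit (i686) glibc dependency chain from the older 'rhel-cd' repository, and that older glibc conflicts with the newer glibc-common already installed. Possible workarounds to try:

    # Ask for the 64-bit package explicitly, so the i686 glibc chain is not pulled in:
    yum install openssl-devel.x86_64

    # Or keep 32-bit packages out of the transaction altogether:
    yum install openssl-devel --exclude='*.i686'

    # Or enable an updates repository whose glibc matches the installed
    # glibc-common (2.12-1.80.el6) instead of the original install media.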
Why can't I find finished jobs in the yarn.resourcemanager.webapp.address web UI?
Hi,

I launched YARN with its web app on port 18088, and then successfully launched and executed some test MR jobs like 'hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.3-alpha.jar pi 2 30'.

But when I log in to the web console in a browser, I cannot find any finished jobs in the 'FINISHED Applications' tab. Why?

Thanks!

Sam Liu
Re: Why can't I find finished jobs in the yarn.resourcemanager.webapp.address web UI?
Can anyone help with this issue? Thanks!

2013/5/2 sam liu
> ...
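A possible explanation, though not confirmed in this thread: the ResourceManager web UI only lists applications the RM still tracks, while completed MapReduce jobs are normally served by the separate JobHistory Server, which has to be configured and started. A sketch using the stock 2.0.x scripts and property names; the host and ports here are examples only:

    # mapred-site.xml (example values):
    #   mapreduce.jobhistory.address        = historyhost:10020
    #   mapreduce.jobhistory.webapp.address = historyhost:19888

    # Start the JobHistory Server, then browse http://historyhost:19888/ for finished jobs:
    sbin/mr-jobhistory-daemon.sh start historyserver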
What's the difference between releases 1.1.1, 1.2.0 and 3.0.0?
Hi Experts,

Who can answer the following questions? We want to know which release is suitable for us. Thanks a lot!

- What's the difference between releases 1.1.1, 1.2.0 and 3.0.0?
- What are their release times?

Sam Liu
Re: What's the difference between releases 1.1.1, 1.2.0 and 3.0.0?
Hi Harsh,

Thanks very much for your detailed explanation!

For the 1.x line, we really want to know which release we could use, so we have further questions:
- Is 1.2.0 more advanced than 1.1.1?
- Do we have a general release time for the above two releases?

For the 2.x line:
- Will its stable release contain all fixes and features of the 1.x line?
- Can we know the general release time of the coming stable release of the 2.x line?

Sam Liu

2012/11/28 Harsh J
> Hi,
>
> [Speaking with HDFS in mind]
>
> The 1.x line is the current stable/maintenance line that has features
> similar to that of 0.20.x before it, with append+sync and security features
> added on top of the pre-existing HDFS.
>
> The 2.x line carries several fixes and brand-new features (high
> availability, protobuf RPCs, federated namenodes, etc.) for HDFS, along
> with several performance optimizations, and is quite a big improvement over
> the 1.x line. The last release of 2.x was 2.0.2, released a couple of
> months ago IIRC. This branch is very new, and is approaching full stability
> soon (although there's been no blocker kinda problems with HDFS at least,
> AFAICT).
>
> 3.x is a placeholder value for "trunk"; it has not been branched for any
> release yet. We are currently focussed on improving the 2.x line further.
>
> On Wed, Nov 28, 2012 at 9:01 AM, sam liu wrote:
> > ...
>
> --
> Harsh J
[jira] [Created] (HDFS-4527) For shortening the time of TaskTracker heartbeat, decouple the statics collection operations
sam liu created HDFS-4527:
------------------------------

Summary: For shortening the time of TaskTracker heartbeat, decouple the statics collection operations
Key: HDFS-4527
URL: https://issues.apache.org/jira/browse/HDFS-4527
Project: Hadoop HDFS
Issue Type: Improvement
Components: performance
Affects Versions: 1.1.1
Reporter: sam liu

In each heartbeat of the TaskTracker, it calculates some system statistics, like the free disk space, available virtual/physical memory, CPU usage, etc. However, it's not necessary to calculate all the statistics in every heartbeat; doing so consumes a lot of system resources and impacts the performance of the TaskTracker heartbeat. Furthermore, the characteristics of the system properties (disk, memory, CPU) are different, and it's better to collect their statistics at different intervals.

To reduce the latency of the TaskTracker heartbeat, one solution is to decouple all the system statistics collection operations from it, and issue separate threads to do the statistics collection work when the TaskTracker starts. There could be three threads: the first collects CPU-related statistics at a short interval; the second collects memory-related statistics at a normal interval; the third collects disk-related statistics at a long interval. All the intervals could be customized by the parameter "mapred.stats.collection.interval" in mapred-site.xml. Finally, the heartbeat could get the values of the system statistics directly from memory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-5046) Hang when add/remove a datanode into/from a 2 datanode cluster
sam liu created HDFS-5046:
------------------------------

Summary: Hang when add/remove a datanode into/from a 2 datanode cluster
Key: HDFS-5046
URL: https://issues.apache.org/jira/browse/HDFS-5046
Project: Hadoop HDFS
Issue Type: Bug
Components: datanode
Affects Versions: 1.1.1
Environment: Red Hat Enterprise Linux Server release 5.3, 64 bit
Reporter: sam liu

1. Install a Hadoop 1.1.1 cluster with 2 datanodes: dn1 and dn2. In hdfs-site.xml, set 'dfs.replication' to 2.
2. Add node dn3 into the cluster as a new datanode, without changing the 'dfs.replication' value in hdfs-site.xml (keep it as 2).
   Note: step 2 passed.
3. Decommission dn3 from the cluster.

Expected result: dn3 can be decommissioned successfully.

Actual result:
a) The decommission progress hangs and the status is always 'Waiting DataNode status: Decommissioned'. But if I execute 'hadoop dfs -setrep -R 2 /', the decommission continues and eventually completes.
b) However, if the initial cluster includes >= 3 datanodes, this issue is not encountered when adding/removing another datanode. For example, if I set up a cluster with 3 datanodes, I can successfully add a 4th datanode to it, and then also successfully remove the 4th datanode from the cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
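A few diagnostics that can show why such a decommission hangs (not part of the original report): decommission only completes once every block on the node has enough replicas elsewhere, so with dfs.replication=2 the NameNode first has to re-replicate dn3's blocks onto dn1/dn2.

    # Blocks still waiting to be re-replicated off the decommissioning node:
    hadoop fsck / -files -blocks -locations | grep -i "under replicated"

    # Node states: "Decommission In Progress" vs. "Decommissioned":
    hadoop dfsadmin -report

    # The workaround observed above: re-asserting the replication factor nudges
    # the NameNode into scheduling the outstanding replication work:
    hadoop dfs -setrep -R 2 /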
[jira] [Created] (HDFS-7002) Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1
sam liu created HDFS-7002:
------------------------------

Summary: Failed to rolling upgrade hdfs from 2.2.0 to 2.4.1
Key: HDFS-7002
URL: https://issues.apache.org/jira/browse/HDFS-7002
Project: Hadoop HDFS
Issue Type: Bug
Components: journal-node, namenode, qjm
Affects Versions: 2.4.1, 2.2.0
Reporter: sam liu
Priority: Blocker

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7053) Failed to rollback hdfs version from 2.4.1 to 2.2.0
sam liu created HDFS-7053:
------------------------------

Summary: Failed to rollback hdfs version from 2.4.1 to 2.2.0
Key: HDFS-7053
URL: https://issues.apache.org/jira/browse/HDFS-7053
Project: Hadoop HDFS
Issue Type: Bug
Components: ha, namenode
Affects Versions: 2.4.1
Reporter: sam liu
Priority: Blocker

I can successfully upgrade from 2.2.0 to 2.4.1 with QJM HA enabled and with downtime, but failed to roll back from 2.4.1 to 2.2.0. The error message:

2014-09-10 16:50:29,599 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.HadoopIllegalArgumentException: Invalid startup option. Cannot perform DFS upgrade with HA enabled.
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1207)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1320)
2014-09-10 16:50:29,601 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7114) Secondary NameNode failed to rollback from 2.4.1 to 2.2.0
sam liu created HDFS-7114:
------------------------------

Summary: Secondary NameNode failed to rollback from 2.4.1 to 2.2.0
Key: HDFS-7114
URL: https://issues.apache.org/jira/browse/HDFS-7114
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.2.0
Reporter: sam liu
Priority: Blocker

I can upgrade from 2.2.0 to 2.4.1, but failed to roll back the secondary namenode, with the following issue:

2014-09-22 10:41:28,358 FATAL org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
org.apache.hadoop.hdfs.server.common.IncorrectVersionException: Unexpected version of storage directory /var/hadoop/tmp/hdfs/dfs/namesecondary. Reported: -56. Expecting = -47.
        at org.apache.hadoop.hdfs.server.common.Storage.setLayoutVersion(Storage.java:1082)
        at org.apache.hadoop.hdfs.server.common.Storage.setFieldsFromProperties(Storage.java:890)
        at org.apache.hadoop.hdfs.server.namenode.NNStorage.setFieldsFromProperties(NNStorage.java:585)
        at org.apache.hadoop.hdfs.server.common.Storage.readProperties(Storage.java:921)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:913)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:249)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.(SecondaryNameNode.java:199)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:652)
2014-09-22 10:41:28,360 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-09-22 10:41:28,363 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7585) TestEnhancedByteBufferAccess hard code the block size
sam liu created HDFS-7585:
------------------------------

Summary: TestEnhancedByteBufferAccess hard code the block size
Key: HDFS-7585
URL: https://issues.apache.org/jira/browse/HDFS-7585
Project: Hadoop HDFS
Issue Type: Test
Components: test
Affects Versions: 2.6.0
Reporter: sam liu
Assignee: sam liu
Priority: Blocker

The test TestEnhancedByteBufferAccess hard-codes the block size, and it fails with exceptions on POWER Linux.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7624) TestFileAppendRestart hardcode block size without considering native OS
sam liu created HDFS-7624:
------------------------------

Summary: TestFileAppendRestart hardcode block size without considering native OS
Key: HDFS-7624
URL: https://issues.apache.org/jira/browse/HDFS-7624
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestFileAppendRestart hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
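The 4096-vs-65536 values in this series of JIRAs match the default memory page size on x86_64 Linux (4 KB) and on POWER/ppc64 Linux (64 KB), which is presumably why a 4 KB hard-coded block size misbehaves only on POWER. A quick way to check the platform value (an illustration, not taken from the JIRA):

    # Prints 4096 on typical x86_64 Linux and 65536 on POWER (ppc64) Linux:
    getconf PAGESIZE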
[jira] [Created] (HDFS-7625) TestPersistBlocks hardcode block size without considering native OS
sam liu created HDFS-7625:
------------------------------

Summary: TestPersistBlocks hardcode block size without considering native OS
Key: HDFS-7625
URL: https://issues.apache.org/jira/browse/HDFS-7625
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestPersistBlocks hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7626) TestPipelinesFailover hardcode block size without considering native OS
sam liu created HDFS-7626:
------------------------------

Summary: TestPipelinesFailover hardcode block size without considering native OS
Key: HDFS-7626
URL: https://issues.apache.org/jira/browse/HDFS-7626
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestPipelinesFailover hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7627) TestCacheDirectives hardcode block size without considering native OS
sam liu created HDFS-7627:
------------------------------

Summary: TestCacheDirectives hardcode block size without considering native OS
Key: HDFS-7627
URL: https://issues.apache.org/jira/browse/HDFS-7627
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestCacheDirectives hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7628) TestNameEditsConfigs hardcode block size without considering native OS
sam liu created HDFS-7628:
------------------------------

Summary: TestNameEditsConfigs hardcode block size without considering native OS
Key: HDFS-7628
URL: https://issues.apache.org/jira/browse/HDFS-7628
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestNameEditsConfigs hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7629) TestDisableConnCache hardcode block size without considering native OS
sam liu created HDFS-7629:
------------------------------

Summary: TestDisableConnCache hardcode block size without considering native OS
Key: HDFS-7629
URL: https://issues.apache.org/jira/browse/HDFS-7629
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu

TestDisableConnCache hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (HDFS-7630) TestConnCache hardcode block size without considering native OS
sam liu created HDFS-7630:
------------------------------

Summary: TestConnCache hardcode block size without considering native OS
Key: HDFS-7630
URL: https://issues.apache.org/jira/browse/HDFS-7630
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Reporter: sam liu
Assignee: sam liu
Attachments: HDFS-7630.001.patch

TestConnCache hard-codes the block size with 'BLOCK_SIZE = 4096'; however, this is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)