It works fine for me with the script that comes with the 1.2.0 release. Here are a few things you can try:
- Add your S3 credentials to core-site.xml:

    <property>
      <name>fs.s3.awsAccessKeyId</name>
      <value>ID</value>
    </property>
    <property>
      <name>fs.s3.awsSecretAccessKey</name>
      <value>SECRET</value>
    </property>

- Run jps and check that all services are up and running (Namenode, SecondaryNamenode, Datanodes, etc.). A sketch of these checks follows below the quoted message.

I think the default HDFS port that comes with spark-ec2 is 9000. You can check that in your core-site.xml file.

Thanks
Best Regards

On Fri, Mar 6, 2015 at 7:14 AM, roni <roni.epi...@gmail.com> wrote:

> Hi,
> I used the spark-ec2 script to create an EC2 cluster.
>
> Now I am trying to copy data from S3 into HDFS.
> I am doing this:
>
> root@ip-172-31-21-160 ephemeral-hdfs]$ bin/hadoop distcp
> s3://<xxx>/home/mydata/small.sam
> hdfs://ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9010/data1
>
> and I get the following error:
>
> 2015-03-06 01:39:27,299 INFO tools.DistCp (DistCp.java:run(109)) - Input
> Options: DistCpOptions{atomicCommit=false, syncFolder=false,
> deleteMissing=false, ignoreFailures=false, maxMaps=20,
> sslConfigurationFile='null', copyStrategy='uniformsize',
> sourceFileListing=null, sourcePaths=[s3://<xxx>/home/mydata/small.sam],
> targetPath=hdfs://ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9010/data1}
> 2015-03-06 01:39:27,585 INFO mapreduce.Cluster
> (Cluster.java:initialize(114)) - Failed to use
> org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid
> "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "
> ec2-52-11-148-31.us-west-2.compute.amazonaws.com:9001"
> 2015-03-06 01:39:27,585 ERROR tools.DistCp (DistCp.java:run(126)) -
> Exception encountered
> java.io.IOException: Cannot initialize Cluster. Please check your
> configuration for mapreduce.framework.name and the correspond server
> addresses.
>     at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
>     at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:352)
>     at org.apache.hadoop.tools.DistCp.execute(DistCp.java:146)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:374)
>
> I tried doing start-all.sh, start-dfs.sh and start-yarn.sh.
>
> What should I do?
> Thanks
> -roni
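P.S. As a rough sketch of the checks above (assuming the ephemeral-hdfs layout that spark-ec2 sets up under /root/ephemeral-hdfs; the bucket name, namenode host, and port are placeholders you'd replace with your own):

    # Confirm the HDFS daemons are running on the master
    jps

    # Check which namenode address/port is configured
    grep -A1 -E 'fs.default.name|fs.defaultFS' /root/ephemeral-hdfs/conf/core-site.xml

    # Retry the copy; with the keys in core-site.xml the S3 URI needs no credentials
    /root/ephemeral-hdfs/bin/hadoop distcp \
        s3n://your-bucket/home/mydata/small.sam \
        hdfs://<namenode-host>:9000/data1

Whether you need s3:// or s3n:// depends on how the data was written to the bucket, so treat the scheme above as an assumption to verify against your setup.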