Hi Sabarish

We finally got S3 working. I think the real problem was that by default
spark-ec2 uses an old version of Hadoop (1.0.4). Once we passed
--copy-aws-credentials --hadoop-major-version=2, it started working.
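
For reference, the launch command we ended up with looked roughly like this
(cluster name, key pair, and region below are placeholders, not our real
values):

    ./spark-ec2 --key-pair=my-keypair --identity-file=my-keypair.pem \
        --region=us-west-1 --copy-aws-credentials --hadoop-major-version=2 \
        launch my-cluster

--copy-aws-credentials propagates the AWS keys from the launching environment
into the cluster's Hadoop configuration, and --hadoop-major-version=2 selects
a Hadoop build whose S3 support is much newer than 1.0.4's.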

Kind regards

Andy


From:  Sabarish Sasidharan <sabarish.sasidha...@manthan.com>
Date:  Sunday, February 14, 2016 at 7:05 PM
To:  Andrew Davidson <a...@santacruzintegration.com>
Cc:  "user @spark" <user@spark.apache.org>
Subject:  Re: newbie unable to write to S3 403 forbidden error

> 
> Make sure you are using an s3 bucket in the same region. Also, I would access my
> bucket this way: s3n://bucketname/foldername.
> 
> You can test privileges using the s3 cmd line client.
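
A quick way to run that check, assuming the s3cmd command line tool is
installed and configured with the same keys (the bucket and file names below
are just examples):

    s3cmd ls s3://com.pws.twitter/
    s3cmd put test.txt s3://com.pws.twitter/json/test.txt

If either command comes back with a 403, the problem is in the credentials or
the bucket policy rather than in Spark.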
> 
> Also, if you are using instance profiles you don't need to specify access and
> secret keys. No harm in specifying though.
> 
> Regards
> Sab
> On 12-Feb-2016 2:46 am, "Andy Davidson" <a...@santacruzintegration.com> wrote:
>> I am using spark 1.6.0 in a cluster created using the spark-ec2 script. I am
>> using the standalone cluster manager.
>> 
>> My java streaming app is not able to write to S3. It appears to be some form
>> of permission problem.
>> 
>> Any idea what the problem might be?
>> 
>> I tried using the IAM simulator to test the policy. Everything seems okay. Any
>> idea how I can debug this problem?
>> 
>> Thanks in advance
>> 
>> Andy
>> 
>>         JavaSparkContext jsc = new JavaSparkContext(conf);
>> 
>>         // I did not include the full keys in my email
>>         // the keys do not contain '\'
>>         // these are the keys used to create the cluster; they belong to the IAM user andy
>>         jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", "AKIAJREX");
>>         jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", "uBh9v1hdUctI23uvq9qR");
>> 
>>     private static void saveTweets(JavaDStream<String> jsonTweets, String outputURI) {
>>         jsonTweets.foreachRDD(new VoidFunction2<JavaRDD<String>, Time>() {
>>             private static final long serialVersionUID = 1L;
>> 
>>             @Override
>>             public void call(JavaRDD<String> rdd, Time time) throws Exception {
>>                 if (!rdd.isEmpty()) {
>>                     // bucket name is 'com.pws.twitter'; it has a folder 'json'
>>                     String dirPath = "s3n://s3-us-west-1.amazonaws.com/com.pws.twitter/json"
>>                             + "-" + time.milliseconds();
>>                     rdd.saveAsTextFile(dirPath);
>>                 }
>>             }
>>         });
>>     }
>> 
>> 
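As Sabarish suggests above, the s3n:// URI in saveTweets() is also worth a
look: with s3n the host part of the URI should be the bucket name itself, not
the region endpoint. Following his s3n://bucketname/foldername form, that line
would look something like this (same bucket and folder):

    String dirPath = "s3n://com.pws.twitter/json" + "-" + time.milliseconds();
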
>> Bucket name: com.pws.twitter
>> Bucket policy (I replaced the account id)
>> 
>> {
>>   "Version": "2012-10-17",
>>   "Id": "Policy1455148808376",
>>   "Statement": [
>>     {
>>       "Sid": "Stmt1455148797805",
>>       "Effect": "Allow",
>>       "Principal": {
>>         "AWS": "arn:aws:iam::123456789012:user/andy"
>>       },
>>       "Action": "s3:*",
>>       "Resource": "arn:aws:s3:::com.pws.twitter/*"
>>     }
>>   ]
>> }
>> 
>> 
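One more thing worth checking in the policy above: the Resource only covers
the objects (arn:aws:s3:::com.pws.twitter/*), so bucket-level actions such as
s3:ListBucket are not actually granted. Hadoop's s3n filesystem lists the
bucket when it writes, so the policy may also need the bucket ARN itself,
e.g. (a sketch, with the account id still replaced):

    "Resource": [
        "arn:aws:s3:::com.pws.twitter",
        "arn:aws:s3:::com.pws.twitter/*"
    ]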

