So when I go to ~/ephemeral-hdfs/bin/hadoop and check its version, it says 
Hadoop 2.0.0-cdh4.2.0.  If I run pyspark and use the s3a address, things should 
work, right?  What am I missing?  And thanks so much for the help so far!
________________________________________
From: Steve Loughran [ste...@hortonworks.com]
Sent: Thursday, July 23, 2015 11:37 AM
To: Ewan Leith
Cc: Greg Anderson; user@spark.apache.org
Subject: Re: Help accessing protected S3

> On 23 Jul 2015, at 01:50, Ewan Leith <ewan.le...@realitymine.com> wrote:
>
> I think the standard S3 driver used in Spark from the Hadoop project (S3n) 
> doesn't support IAM role based authentication.
>
> However, S3a should support it. If you're running Hadoop 2.6 via the 
> spark-ec2 scripts (I'm not sure what it launches with by default) try 
> accessing your bucket via s3a:// URLs instead of s3n://
>
> http://wiki.apache.org/hadoop/AmazonS3
>
> https://issues.apache.org/jira/browse/HADOOP-10400
>
> Thanks,
> Ewan
>

s3a should support roles. Note that it isn't ready for production use before 
Hadoop 2.7.1; various scalability and performance problems surfaced after 2.6 
shipped.
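
To make the version point concrete, here is a minimal Python sketch that checks a `hadoop version` line against the two thresholds mentioned in this thread: s3a arriving in Hadoop 2.6 (HADOOP-10400) and being production-ready from 2.7.1. The function names and status strings are made up for illustration, and the check assumes the leading x.y.z in the version line is the Apache base version; vendor builds may backport features, so treat the result as a hint rather than a definitive answer.

```python
import re


def parse_hadoop_version(version_line):
    """Return (major, minor, patch) from text such as 'Hadoop 2.0.0-cdh4.2.0'.

    Assumption: the first x.y.z found is the Apache base version.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_line)
    if match is None:
        raise ValueError("no version number found in %r" % version_line)
    return tuple(int(part) for part in match.groups())


def s3a_status(version_line):
    """Classify s3a support using the thresholds discussed in this thread."""
    version = parse_hadoop_version(version_line)
    if version < (2, 6, 0):
        # s3a was added in Hadoop 2.6, so older builds do not ship it.
        return "unavailable"
    if version < (2, 7, 1):
        # Present, but with the scalability and performance problems noted above.
        return "available, not production-ready"
    return "available"


if __name__ == "__main__":
    print(s3a_status("Hadoop 2.0.0-cdh4.2.0"))
```

Under these assumptions, the 2.0.0-cdh4.2.0 build reported at the top of this message falls below the 2.6 threshold, which would explain why s3a:// URLs do not work there.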

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
