So when I go to ~/ephemeral-hdfs/bin/hadoop and check its version, it says Hadoop 2.0.0-cdh4.2.0. If I run pyspark and use the s3a address, things should work, right? What am I missing?

And thanks so much for the help so far!

________________________________________
From: Steve Loughran [ste...@hortonworks.com]
Sent: Thursday, July 23, 2015 11:37 AM
To: Ewan Leith
Cc: Greg Anderson; user@spark.apache.org
Subject: Re: Help accessing protected S3
> On 23 Jul 2015, at 01:50, Ewan Leith <ewan.le...@realitymine.com> wrote:
>
> I think the standard S3 driver used in Spark from the Hadoop project (S3n)
> doesn't support IAM role based authentication.
>
> However, S3a should support it. If you're running Hadoop 2.6 via the
> spark-ec2 scripts (I'm not sure what it launches with by default) try
> accessing your bucket via s3a:// URLs instead of s3n://
>
> http://wiki.apache.org/hadoop/AmazonS3
>
> https://issues.apache.org/jira/browse/HADOOP-10400
>
> Thanks,
> Ewan
>

s3a should support roles. Note that it isn't ready for production use before Hadoop 2.7.1; various scalability and performance problems surfaced after 2.6 shipped.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
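
For anyone comparing their own cluster against the version numbers mentioned in this thread, here is a rough Python sketch of that check. It assumes the thresholds discussed above (s3a shipping with Hadoop 2.6, and not being production-ready before 2.7.1) and looks only at the upstream version number, so a vendor build that backports s3a would be misreported. The function names and status strings are made up for illustration and are not part of Hadoop or Spark.

```python
import re


def parse_hadoop_version(text):
    """Pull the first (major, minor, patch) triple out of a version string.

    Works on output such as 'Hadoop 2.0.0-cdh4.2.0'; the vendor suffix
    after the first triple is ignored.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def s3a_status(version_text):
    """Classify s3a support using the thresholds mentioned in this thread.

    Assumption: only the upstream Hadoop version is considered, so
    vendor backports are not detected.
    """
    version = parse_hadoop_version(version_text)
    if version < (2, 6, 0):
        return "unavailable"
    if version < (2, 7, 1):
        return "available, not production-ready"
    return "available"


if __name__ == "__main__":
    # The version string reported earlier in the thread.
    print(s3a_status("Hadoop 2.0.0-cdh4.2.0"))
```

By this check, the 2.0.0-cdh4.2.0 build reported above predates s3a, which would explain why s3a:// URLs are not recognised on that cluster.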