Yes, IAM roles are in fact required for EMR now. If you run Spark on EMR (as opposed to plain EC2), you get the S3 filesystem configuration for free (it goes by the name EMRFS), and it will use your cluster's IAM role when communicating with S3. Here is the corresponding documentation: http://docs.aws.amazon.com/ElasticMapReduce/latest/ManagementGuide/emr-fs.html
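With the IAM-role approach, no keys appear anywhere in code or config; the instance role just needs read access to the bucket. As a rough illustration (not from the thread; the bucket name is a placeholder), a minimal read-only policy attached to the role might look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}
```

Spark (via EMRFS, or the Hadoop S3 connectors on EC2 with an instance profile) then picks up temporary credentials from the instance metadata automatically.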
On Mon, Jan 11, 2016 at 11:37 AM Matei Zaharia <matei.zaha...@gmail.com> wrote:

> In production, I'd recommend using IAM roles to avoid having keys
> altogether. Take a look at
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
>
> Matei
>
> On Jan 11, 2016, at 11:32 AM, Sabarish Sasidharan
> <sabarish.sasidha...@manthan.com> wrote:
>
> If you are on EMR, these can go into your hdfs-site config, and they will
> work with Spark on YARN by default.
>
> Regards
> Sab
>
> On 11-Jan-2016 5:16 pm, "Krishna Rao" <krishnanj...@gmail.com> wrote:
>
>> Hi all,
>>
>> Is there a method for reading from S3 without having to hard-code keys?
>> The only 2 ways I've found both require this:
>>
>> 1. Set conf in code, e.g.:
>> sc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "<aws_key>")
>> sc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "<aws_secret_key>")
>>
>> 2. Set keys in the URL, e.g.:
>> sc.textFile("s3n://<aws_key>:<aws_secret_key>@bucket/test/testdata")
>>
>> Both of which I'm reluctant to do within production code!
>>
>> Cheers
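For completeness, Sabarish's suggestion of moving the keys into the cluster's Hadoop site configuration would look roughly like this (a sketch only; the property names mirror the ones used in the thread, and the values are placeholders — with an IAM role you avoid needing these entries at all):

```xml
<!-- core-site.xml (or hdfs-site.xml) on the cluster; placeholder values.
     Spark on YARN picks these up without any keys in application code. -->
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

This keeps credentials out of application code, though they still live in plain text on the cluster, which is why the IAM-role route is preferable in production.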