Re: Setting S3 output file grantees for spark output files

2015-06-05 Thread Justin Steigel
I figured it out. I had to add this line to the script: sc._jsc.hadoopConfiguration().set("fs.s3.canned.acl", "BucketOwnerFullControl"). Basically, I had to get the JavaSparkContext from the SparkContext to access the Hadoop configuration and set the permissions. Follow-up question: is there a better way to do this?

Re: Setting S3 output file grantees for spark output files

2015-06-05 Thread Akhil Das
You could try adding the configuration to the spark-defaults.conf file. Once you run the application, you can check the Environment tab on the driver UI (which runs on port 4040) to see whether the configuration is set properly. Thanks, Best Regards. On Thu, Jun 4, 2015 at 8:40 PM, Justin Steigel wrote:
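A sketch of what that spark-defaults.conf entry might look like (an assumption, not quoted from the thread): since spark-defaults.conf only accepts spark.* keys, the Hadoop property would typically go in via the "spark.hadoop." prefix.

```
# spark-defaults.conf (illustrative; the spark.hadoop. prefix forwards
# the key into the Hadoop configuration as fs.s3.canned.acl)
spark.hadoop.fs.s3.canned.acl  BucketOwnerFullControl
```

With this in place, the property should appear under the Environment tab of the driver UI on port 4040, as described above.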

Setting S3 output file grantees for spark output files

2015-06-04 Thread Justin Steigel
Hi all, I'm running Spark on AWS EMR and I'm having some issues getting the correct permissions on the output files written with rdd.saveAsTextFile(''). In Hive, I would add a line at the beginning of the script, set fs.s3.canned.acl=BucketOwnerFullControl, and that would set the correct grantees for the output files.