I figured it out. I had to add this line to the script:
sc._jsc.hadoopConfiguration().set("fs.s3.canned.acl", "BucketOwnerFullControl")
Basically, I had to go through the JavaSparkContext inside the SparkContext to
reach the Hadoop configuration and set the permission there.
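For context, here is a minimal sketch of how that line fits into a full script;
the app name and the S3 output path are placeholders, not the actual values
from my job:

    from pyspark import SparkContext

    sc = SparkContext(appName="canned-acl-example")  # placeholder app name

    # Reach the underlying Hadoop configuration through the JavaSparkContext
    # and ask S3 to grant the bucket owner full control of the output files.
    sc._jsc.hadoopConfiguration().set("fs.s3.canned.acl", "BucketOwnerFullControl")

    rdd = sc.parallelize(["some", "records"])
    rdd.saveAsTextFile("s3://example-bucket/output/")  # placeholder S3 path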
Follow-up question: Is there a better way to set this configuration?
You could try adding the configuration in the spark-defaults.conf file. Once
you run the application, you can check the Environment tab of the driver UI
(which runs on port 4040) to confirm the configuration was picked up properly.
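For example, Spark copies any property prefixed with spark.hadoop. into the
Hadoop configuration, so a line like the following in spark-defaults.conf
should have the same effect (worth double-checking that this applies on your
EMR setup):

    # conf/spark-defaults.conf
    spark.hadoop.fs.s3.canned.acl    BucketOwnerFullControl

The same property can also be passed per job on the command line, e.g.
spark-submit --conf spark.hadoop.fs.s3.canned.acl=BucketOwnerFullControl ...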
Thanks
Best Regards
On Thu, Jun 4, 2015 at 8:40 PM, Justin Steigel wrote:
Hi all,
I'm running Spark on AWS EMR and I'm having some issues getting the correct
permissions on the output files using
rdd.saveAsTextFile(''). In Hive, I would add a line at the
beginning of the script with
set fs.s3.canned.acl=BucketOwnerFullControl
and that would set the correct grantees for the output files. Is there an
equivalent way to set this in Spark?