Setting S3 output file grantees for spark output files

2015-06-04 Thread Justin Steigel
Hi all, I'm running Spark on AWS EMR and I'm having some issues getting the correct permissions on the output files written with rdd.saveAsTextFile(''). In Hive, I would add a line at the beginning of the script, set fs.s3.canned.acl=BucketOwnerFullControl, and that would set the correct grantees for the output files. Is there a way to set this property in Spark?
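For illustration, one possible way to do this (a sketch, not confirmed by the thread) is to set the property on the SparkContext's underlying Hadoop configuration before writing. The bucket and path are placeholders, and sc._jsc is PySpark's internal handle to the JavaSparkContext:

    # Sketch only: set the canned ACL on the Hadoop configuration that
    # saveAsTextFile uses. Bucket/path are placeholders; sc._jsc is
    # PySpark's internal bridge to the underlying JavaSparkContext.
    from pyspark import SparkContext

    sc = SparkContext(appName="s3-acl-example")
    sc._jsc.hadoopConfiguration().set("fs.s3.canned.acl",
                                      "BucketOwnerFullControl")

    rdd = sc.parallelize(["record one", "record two"])
    rdd.saveAsTextFile("s3://example-bucket/output/")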

Re: Setting S3 output file grantees for spark output files

2015-06-05 Thread Justin Steigel
... the spark-defaults.conf file.
> And once you run the application you can actually check on the driver UI
> (runs on port 4040), Environment tab, to see if the configuration is set
> properly.
>
> Thanks
> Best Regards
>
> On Thu, Jun 4, 2015 at 8:40 PM, Justin Steigel wrote:
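For reference, the spark-defaults.conf approach mentioned in the quoted reply would look something like the line below: Spark copies any property prefixed with spark.hadoop. into the Hadoop Configuration it uses for I/O, which is how a Hadoop setting like fs.s3.canned.acl can be supplied at submit time.

    # In conf/spark-defaults.conf: properties prefixed with "spark.hadoop."
    # are copied into the Hadoop Configuration used when writing to S3.
    spark.hadoop.fs.s3.canned.acl  BucketOwnerFullControl

Once the application is running, the property should appear on the Environment tab of the driver UI, as the reply suggests.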

Spark Python process

2015-06-24 Thread Justin Steigel
I have a Spark job running on a 10-node cluster, and the Python process on every node is pegged at 100% CPU. I was wondering which parts of a Spark script run in the Python process and which get passed to the Java processes. Is there any documentation on this? Thanks, Justin
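For what it's worth, a rough sketch of the split in the plain RDD API (paths are placeholders): the driver pickles any user-supplied Python function and ships it to Python worker processes on each executor, while reading, shuffle transfer, and writing happen in the JVM.

    # Rough sketch of which side executes what in PySpark. Paths are
    # placeholders. JVM: I/O and shuffle machinery. Python workers: any
    # user function passed to an RDD transformation.
    from pyspark import SparkContext

    sc = SparkContext(appName="python-vs-jvm")

    lines = sc.textFile("s3://example-bucket/input/")      # read: JVM

    # These lambdas are pickled, shipped to executors, and run inside
    # separate Python worker processes -- the ones pegged at 100%.
    pairs = lines.map(lambda l: (l.split(",")[0], 1))

    # Shuffle transfer happens in the JVM, but the add function below
    # still executes in the Python workers.
    counts = pairs.reduceByKey(lambda a, b: a + b)

    counts.saveAsTextFile("s3://example-bucket/output/")   # write: JVM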