Re: Spark and S3 server side encryption

2015-01-27 Thread Thomas Demoor
Spark uses the Hadoop filesystems. I assume you are trying to use s3n://, which, under the hood, uses the third-party jets3t library. It is configured through the jets3t.properties file (google "hadoop s3n jets3t"), which you should put on Spark's classpath. The setting you are looking for is s3servic

Re: SaveAsTextFile to S3 bucket

2015-01-27 Thread Thomas Demoor
object. It has no effect on the file "/dev/output", which is, as far as S3 cares, another object that happens to share part of its object name with /dev.

Thomas Demoor

On Tue, Jan 27, 2015 at 6:33 AM, Chen, Kevin wrote:
> When spark saves rdd
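
The point above, that S3 treats "/dev" and "/dev/output" as unrelated objects, can be illustrated with a toy model. This is only a sketch: the dict-based `bucket` and the helpers `put_object` and `delete_object` are hypothetical stand-ins, not the S3 API, and the delete is just one example of an operation applied to a single object.

```python
# Toy model of an S3 bucket: a flat mapping from full object name to contents.
# Hypothetical illustration only; this is not the S3 API.
bucket = {}


def put_object(key, body):
    """Store an object under its full name; no directories are created."""
    bucket[key] = body


def delete_object(key):
    """Remove exactly one object; names that merely share a prefix are untouched."""
    bucket.pop(key, None)


put_object("/dev", b"")                     # an object that looks like a folder
put_object("/dev/output", b"rdd contents")  # a separate object with a longer name

delete_object("/dev")                       # acts on the "/dev" object only

remaining = sorted(bucket)
print(remaining)  # ['/dev/output']
```

Because the namespace is flat, the slash in an object name carries no hierarchy of its own; any "folder" behaviour is something a client layers on top.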

Re: performance of saveAsTextFile moving files from _temporary

2015-01-28 Thread Thomas Demoor
final output by using a custom OutputCommitter which does not use a temporary location.

Thomas Demoor

On Wed, Jan 28, 2015 at 3:54 AM, Josh Walton wrote:
> I'm not sure how to confirm how the moving is happening, however, one of
> the job

Re: Which OutputCommitter to use for S3?

2015-02-26 Thread Thomas Demoor
FYI. We're currently addressing this at the Hadoop level in https://issues.apache.org/jira/browse/HADOOP-9565

Thomas Demoor

On Mon, Feb 23, 2015 at 10:16 PM, Darin McBeath wrote:
> Just to close the loop in case anyone runs into the same problem I had.
>
> By setting --hadoop-m