When you set the keys on hadoopConfiguration directly, I don't think you have to replace "/" with "%2F". Have you tried it without that? Also make sure you're not replacing slashes in the URIs themselves.
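Something like this is what I have in mind (an untested sketch, reusing the creds object and URI from your message):

    import org.apache.spark.SparkContext

    val sc = new SparkContext("local[8]", "SparkS3Test")

    // Pass the raw key strings from your creds object; no URL-encoding
    sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", creds.accessKeyId)
    sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", creds.secretAccessKey)

    // The URIs keep their literal slashes as well
    val data = sc.textFile("s3n://odesk-bucket-name/subbucket/2014/01/datafile-01.gz")
    println(data.count())

As far as I know, the %2F escaping only matters if you embed the secret key in the URI itself (s3n://accessKey:secretKey@bucket/...), not when you set it through the configuration.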
Matei

On Jul 2, 2014, at 4:17 PM, Brian Gawalt <bgaw...@gmail.com> wrote:

> Hello everyone,
>
> I'm having some difficulty reading from my company's private S3 buckets.
> I've got an S3 access key and secret key, and I can read the files fine from
> a non-Spark Scala routine via AWScala <http://github.com/seratch/AWScala>.
> But trying to read them with SparkContext.textFile([comma-separated
> s3n://bucket/key URIs]) leads to the following stack trace (where I've
> changed the object key to use the terms 'subbucket' and 'datafile-' for
> privacy reasons):
>
> [error] (run-main-0) org.apache.hadoop.fs.s3.S3Exception:
> org.jets3t.service.S3ServiceException: S3 HEAD request failed for
> '/subbucket%2F2014%2F01%2Fdatafile-01.gz' - ResponseCode=403,
> ResponseMessage=Forbidden
> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException:
> S3 HEAD request failed for '/ja_quick_info%2F2014%2F01%2Fapplied-01.gz' -
> ResponseCode=403, ResponseMessage=Forbidden
>     at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:483)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryI
> [... etc ...]
>
> I'm handing off the credentials via the following method:
>
>     def cleanKey(s: String): String = s.replace("/", "%2F")
>
>     val sc = new SparkContext("local[8]", "SparkS3Test")
>
>     sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId",
>       cleanKey(creds.accessKeyId))
>     sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey",
>       cleanKey(creds.secretAccessKey))
>
> The comma-separated URIs themselves each look like:
>
>     s3n://odesk-bucket-name/subbucket/2014/01/datafile-01.gz
>
> The actual string that I've replaced with 'subbucket' includes underscores
> but is otherwise just straight ASCII; the term that 'datafile' substitutes
> for is also just straight ASCII.
>
> This is using Spark 1.0.0, via an sbt library dependency of:
>
>     "org.apache.spark" % "spark-core_2.10" % "1.0.0"
>
> Any tips appreciated!
> Thanks much,
> -Brian
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/AWS-Credentials-for-private-S3-reads-tp8687.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.