How do I use it? I'm accessing s3a through Spark's textFile API.
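For context, this is roughly what I run in spark-shell (sc is the shell's SparkContext; the bucket name and path below are placeholders, and I pull the credentials from environment variables here rather than pasting the real keys):

    // Same Hadoop configuration as in my first mail, set on the live SparkContext
    sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
    sc.hadoopConfiguration.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
    sc.hadoopConfiguration.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

    // The read that fails: every task dies with 403 Forbidden
    val lines = sc.textFile("s3a://my-bucket/path/to/data.txt")
    println(lines.count())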
On Tue, May 31, 2016 at 7:16 AM, Deepak Sharma <deepakmc...@gmail.com> wrote:

> Hi Mayuresh
> Instead of s3a, have you tried the https:// uri for the same s3 bucket?
>
> HTH
> Deepak
>
> On Tue, May 31, 2016 at 4:41 PM, Mayuresh Kunjir <mayur...@cs.duke.edu> wrote:
>
>> On Tue, May 31, 2016 at 5:29 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>>
>>> which s3 endpoint?
>>
>> I have tried both s3.amazonaws.com and s3-external-1.amazonaws.com.
>>
>>> On 29 May 2016, at 22:55, Mayuresh Kunjir <mayur...@cs.duke.edu> wrote:
>>>
>>> I'm running into permission issues while accessing data in an S3 bucket
>>> through the s3a file system from a local Spark cluster. Has anyone had
>>> success with this?
>>>
>>> My setup is:
>>> - Spark 1.6.1 compiled against Hadoop 2.7.2
>>> - aws-java-sdk-1.7.4.jar and hadoop-aws-2.7.2.jar on the classpath
>>> - Spark's Hadoop configuration is as follows:
>>>
>>>   sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
>>>   sc.hadoopConfiguration.set("fs.s3a.access.key", <access>)
>>>   sc.hadoopConfiguration.set("fs.s3a.secret.key", <secret>)
>>>
>>> (The secret key does not contain any '/' characters, which others have
>>> reported to cause problems.)
>>>
>>> I have configured my S3 bucket to grant the necessary permissions
>>> (https://sparkour.urizone.net/recipes/configuring-s3/).
>>>
>>> What works: listing, reading from, and writing to s3a with the hadoop
>>> command, e.g. hadoop dfs -ls s3a://<bucket name>/<file path>
>>>
>>> What doesn't work: reading from s3a through Spark's textFile API. Each
>>> task throws an exception that says *Forbidden Access (403)*.
>>>
>>> Some online documents suggest using IAM roles to grant permissions on an
>>> AWS cluster, but I would like a solution for my local standalone cluster.
>>>
>>> Any help would be appreciated.
>>>
>>> Regards,
>>> ~Mayuresh
>
>
> --
> Thanks
> Deepak
> www.bigdatabig.com
> www.keosha.net
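Following up on the endpoint question above: one more thing I can try is pinning the endpoint explicitly through the standard Hadoop 2.7 s3a property fs.s3a.endpoint instead of relying on the default. A minimal sketch in the same spark-shell session (the endpoint value shown is just the US Standard one and would need to match the bucket's region; the path is the same placeholder as above):

    // fs.s3a.endpoint is a Hadoop 2.7 s3a property; the value here is only an example
    sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.amazonaws.com")

    // Retry the read after pinning the endpoint
    val retry = sc.textFile("s3a://my-bucket/path/to/data.txt")
    println(retry.take(1).mkString)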