It seems I was not able to connect to sts.amazonaws.com earlier; that error is fixed now. The Spark write to S3 is now able to create the folder structure on S3, but it fails when writing the final part files.
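For reference, the failing call is a plain text write (the stack trace goes through DataFrameWriter.text) against the bucket and prefix that appear in the error path. Roughly, S3Writer does something like the sketch below; the DataFrame, column selection and save mode are my assumptions for illustration, not taken from the actual job:

```scala
// Sketch of the failing write, inferred from the stack trace and the s3a path
// in the error; column selection and save mode are illustrative assumptions.
val outputPath = s"s3a://gpn-corebatch-posting-extracts/totals-extract-${System.currentTimeMillis()}"
extractDf
  .select("value")       // DataFrameWriter.text requires a single string column
  .write
  .mode("overwrite")
  .text(outputPath)
```

The full error: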
org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:226)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:178)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:108)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:106)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:131)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:175)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:213)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:210)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:171)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:122)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:121)
    at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:963)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:963)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:415)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:399)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:288)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:897)
Exception occurred while running transaction extracts job: Job aborted.
    at com.gpn.batch.writer.S3Writer.write(S3Writer.java:9)
    at com.gpn.batch.PostedTransactionsJob.main(PostedTransactionsJob.java:47)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 6.0 failed 4 times, most recent failure: Lost task 1.3 in stage 6.0 (TID 17, 10.37.2.40, executor 1): java.nio.file.AccessDeniedException: s3a://gpn-corebatch-posting-extracts/totals-extract-1612978376492/_temporary/0/_temporary/attempt_20210210173339_0006_m_000001_17/part-00001-43be031c-5f3d-4b4f-bd2d-dc19ed99c7b4-c000.txt: getFileStatus on s3a://gpn-corebatch-posting-extracts/totals-extract-1612978376492/_temporary/0/_temporary/attempt_20210210173339_0006_m_000001_17/part-00001-43be031c-5f3d-4b4f-bd2d-dc19ed99c7b4-c000.txt: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 86B9CEF5EDA607F8; S3 Extended Request ID: 1XOprWwxqw0OV9mhb4wFkB3cOhwcI/kaFHctXEgGaovT8VTRWjnW6DwaMyO0laeCNUmn1nTbQYY=; Proxy: null), S3 Extended Request ID: 1XOprWwxqw0OV9mhb4wFkB3cOhwcI/kaFHctXEgGaovT8VTRWjnW6DwaMyO0laeCNUmn1nTbQYY=:403 Forbidden
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:230)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:151)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2198)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2163)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2102)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:752)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
    at org.apache.spark.sql.execution.datasources.CodecStreams$.createOutputStream(CodecStreams.scala:81)
    at org.apache.spark.sql.execution.datasources.text.TextOutputWriter.<init>(TextOutputWriter.scala:33)
    at org.apache.spark.sql.execution.datasources.text.TextFileFormat$$anon$1.newInstance(TextFileFormat.scala:84)
    at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:126)
    at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:111)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:264)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:127)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
    at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 86B9CEF5EDA607F8; S3 Extended Request ID: 1XOprWwxqw0OV9mhb4wFkB3cOhwcI/kaFHctXEgGaovT8VTRWjnW6DwaMyO0laeCNUmn1nTbQYY=; Proxy: null), S3 Extended Request ID: 1XOprWwxqw0OV9mhb4wFkB3cOhwcI/kaFHctXEgGaovT8VTRWjnW6DwaMyO0laeCNUmn1nTbQYY=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1819)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1403)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1372)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5259)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5206)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1360)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1249)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1246)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2183)
    ... 21 more

Can someone help me with this issue? If it is an IAM permission issue, which permission might be missing to cause this? If not, what is the root cause?

*Thanks,*
Rishabh Jain
Application Developer
Email rishabh.j...@thoughtworks.com
Telephone +91 6264277897

On Wed, Feb 10, 2021 at 2:26 PM Vladimir Prus <vladimir.p...@gmail.com> wrote:

> Hi,
>
> the fsGroup setting should match the id Spark is running as. When building
> from source, that id is 185, and you can use "docker inspect <image-name>"
> to double-check.
>
> On Wed, Feb 10, 2021 at 11:43 AM Rishabh Jain <rishabh.j...@thoughtworks.com> wrote:
>
>> Hi,
>>
>> I tried doing what Vladimir suggested.
>> But no luck there either. My guess is that it has something to do with
>> securityContext.fsGroup. I am trying to pass the yaml file path along with the
>> spark-submit command. My yaml file content is:
>>
>> ```
>> apiVersion: v1
>> kind: Pod
>> spec:
>>   securityContext:
>>     fsGroup: 65534
>>   serviceAccount: <service accoun>
>>   serviceAccountName: <service account name>
>> ```
>>
>> Is there anything wrong with this yaml file?
>>
>> *Thanks,*
>> Rishabh Jain
>> Application Developer
>> Email rishabh.j...@thoughtworks.com
>> Telephone +91 6264277897
>>
>> On Tue, Feb 9, 2021 at 10:44 PM Vladimir Prus <vladimir.p...@gmail.com> wrote:
>>
>>> On 9 Feb 2021, at 19:46, Rishabh Jain <rishabh.j...@thoughtworks.com> wrote:
>>>
>>> Hi,
>>>
>>> We are trying to access S3 from a Spark job running on an EKS cluster pod. I
>>> have a service account that has an IAM role attached with full S3
>>> permission. We are using DefaultCredentialsProviderChain. But still we are
>>> getting 403 Forbidden from S3.
>>>
>>> It's hard to say without more information, but some things you might want
>>> to double-check:
>>>
>>> - Make sure the Spark job is using a sufficiently new AWS SDK, so that IAM
>>>   for service accounts is supported.
>>> - Modify your job to print the effective role, e.g.
>>>
>>>     val stsClient = AWSSecurityTokenServiceClientBuilder.standard().build();
>>>     val request = new GetCallerIdentityRequest()
>>>     val identity = stsClient.getCallerIdentity(request)
>>>     println(identity.getArn())
>>>
>>> - If the above does not print the expected role, verify that the pods
>>>   actually have the right service account, that the AWS_ROLE_ARN and
>>>   AWS_WEB_IDENTITY_TOKEN_FILE variables are set on the pod, and that
>>>   the assume-role policy for the role does allow EKS to assume that role.
>>> - If the above prints the expected role, then a 403 error means you did
>>>   not set up IAM policies on your role/bucket.
>>>
>>> Is there anything wrong with our approach?
>>>
>>> Generally speaking, IAM for service accounts in EKS + Spark works; it's
>>> just that there are a lot of things that can go wrong the first time you do it.
>>>
>>> HTH,
>>
>
> --
> Vladimir Prus
> http://vladimirprus.com
>
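For completeness, a self-contained version of the identity check Vladimir suggested above is sketched below. It is only an illustration: the object name is mine, and it assumes the AWS SDK v1 STS client (aws-java-sdk-sts, or the bundled aws-java-sdk) is on the job's classpath.

```scala
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder
import com.amazonaws.services.securitytoken.model.GetCallerIdentityRequest

// Hypothetical helper for this thread's debugging step; not part of the job.
object CallerIdentityCheck {
  // Prints the ARN the job is actually running as. With IAM roles for
  // service accounts wired up correctly, this should be the role attached
  // to the pod's service account, not the EKS node/instance role.
  def printCallerIdentity(): Unit = {
    val sts = AWSSecurityTokenServiceClientBuilder.standard().build()
    val identity = sts.getCallerIdentity(new GetCallerIdentityRequest())
    println(s"Running as: ${identity.getArn}")
  }
}
```

If this prints the node role rather than the service-account role, the web-identity credentials are not being picked up; if it prints the expected role, the 403 points back at the IAM/bucket policy, as noted in the thread.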