> admission webhook "validation.gatekeeper.sh" denied the request:

First, this error is not coming from Spark; it is coming from the
admission policies enforced on your cluster by Gatekeeper
<https://open-policy-agent.github.io/gatekeeper/website/>. That said, yes,
you can satisfy those policies by providing pod templates for the Spark
driver and executor pods.
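
If you want to confirm exactly which policies are in play, Gatekeeper
registers every constraint kind under a shared category, so the command
below should list them all (assuming your RBAC setup lets you read the
constraint resources):

kubectl get constraints

The names in brackets in your error message ([must-have-probes],
[restricted-capabilities], and so on) are the constraint names you should
see in that list.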

The pod template you provided is not valid, though. A pod template
overrides the default configuration of Spark pods at the pod level:
annotations, labels, volumes, and other fields under spec. However,
livenessProbe, readinessProbe, and the capabilities you listed are
container-level fields rather than pod-level ones, so they must be nested
under an entry in spec.containers. To achieve your goal, adjust the
template to something like the following:

apiVersion: v1
kind: Pod
spec:
  securityContext: # Pod-level security context
    fsGroup: 2000
  containers:
    - name: spark-kubernetes-driver # Spark injects the image and command into this container
      securityContext: # Container-level security context
        capabilities:
          drop:
            - MKNOD
            - KILL
            - SYS_CHROOT
        runAsUser: 1001
        runAsGroup: 101
      livenessProbe:
        failureThreshold: 3
        exec:
          command:
            - touch
            - /tmp/healthy
        initialDelaySeconds: 60
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      readinessProbe:
        failureThreshold: 3
        exec:
          command:
            - touch
            - /tmp/healthy
        initialDelaySeconds: 60
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
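
Two more notes. Your error output also flags [k8s-emptydir-size] for the
<spark-local-dir-1> emptyDir volume that Spark creates for scratch space.
A sketch of a pod-level addition to the same template that should satisfy
that policy (the 1Gi value is only a placeholder, and you should verify
that your Spark version keeps a template-defined volume of this name
rather than appending its own):

spec:
  volumes:
    - name: spark-local-dir-1
      emptyDir:
        sizeLimit: 1Gi # placeholder size; pick what your jobs need

Also, your spark-submit points both
spark.kubernetes.driver.podTemplateFile and
spark.kubernetes.executor.podTemplateFile at driver.yaml. The executor
container name defaults to spark-kubernetes-executor (configurable via
spark.kubernetes.executor.podTemplateContainerName), so double-check that
the executor pods actually pick up these container-level settings, or use
a separate executor template with that container name.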


On Fri, Jan 3, 2025 at 10:19 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
wrote:

> Could you file an official Spark JIRA issue with those reproducers (your
> CLI command, YAML files, error messages)?
>
> In addition, it would be helpful if you could describe how you set up
> your K8s cluster, to make sure that it's not a K8s cluster issue.
>
> Once the JIRA is filed, we can continue on that JIRA issue.
>
> BTW, the Spark driver pod is a normal pod. So, you had better start with
> a POD YAML file that deploys successfully before using `spark-submit`.
>
> In other words, please attach a sample `busybox`-image POD YAML file
> that passes all validations on your K8s cluster.
>
> Thanks,
> Dongjoon.
>
>
> On Fri, Jan 3, 2025 at 12:35 PM jilani shaik <jilani2...@gmail.com> wrote:
>
>> Thanks, Dongjoon, for the details.
>>
>> I added an almost identical yaml file as the template reference file,
>> and I am getting the error below:
>>
>> Exception in thread "main"
>> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing:
>> POST at: https://k8sURL/api/v1/namespaces/namespace1/pods. Message:
>> Forbidden! User user123 doesn't have permission. admission webhook
>> "validation.gatekeeper.sh" denied the request: [must-have-probes]
>> Container <spark-kubernetes-driver> in your <Pod>
>> <spark-pi-bd7ea2942dd485c3-driver> has no <livenessProbe>
>>
>> [must-have-probes] Container <spark-kubernetes-driver> in your <Pod>
>> <spark-pi-bd7ea2942dd485c3-driver> has no <readinessProbe>
>>
>> [psp-pods-allowed-user-ranges] Container spark-kubernetes-driver is
>> attempting to run without a required securityContext/runAsUser
>>
>> [restricted-capabilities] container <spark-kubernetes-driver> is not
>> dropping all required capabilities. Container must drop all of ["KILL",
>> "MKNOD", "SYS_CHROOT"] or "ALL"
>>
>> [k8s-emptydir-size] emptyDir volume <spark-local-dir-1> must have a size
>> limit.
>>
>>        at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
>>        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:518)
>>        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
>>        at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340)
>>        at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:703)
>>        at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:92)
>>        at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
>>        at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1108)
>>        at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:92)
>>        at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:153)
>>        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6(KubernetesClientApplication.scala:256)
>>        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6$adapted(KubernetesClientApplication.scala:250)
>>        at org.apache.spark.util.SparkErrorUtils.tryWithResource(SparkErrorUtils.scala:48)
>>        at org.apache.spark.util.SparkErrorUtils.tryWithResource$(SparkErrorUtils.scala:46)
>>        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:94)
>>        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:250)
>>        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:223)
>>        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
>>        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>>        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>>        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>>        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>>        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>>        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> My spark-submit command is like this, along with the pod template files:
>>
>>
>>
>> spark-submit --verbose --master k8s://https://k8surl --deploy-mode
>> cluster --name spark-pi --properties-file location1/spark-defaults.conf
>> --num-executors 5 --conf
>> spark.kubernetes.authenticate.driver.serviceAccountName=user123 --conf
>> spark.kubernetes.namespace=namespace1 --conf
>> spark.kubernetes.authenticate.caCertFile=location1/ca.crt --conf
>> spark.kubernetes.authenticate.oauthTokenFile=location1/token1 --conf
>> spark.kubernetes.authenticate.submission.caCertFile=location1/ca.crt --conf
>> spark.kubernetes.authenticate.submission.oauthTokenFile=location1/token1
>> --conf spark.kubernetes.file.upload.path=/tmp/ --conf
>> spark.kubernetes.driver.podTemplateFile=location1/k8s/spark/driver.yaml
>> --conf
>> spark.kubernetes.executor.podTemplateFile=location1/k8s/spark/driver.yaml
>> --class org.apache.spark.examples.SparkPi
>> spark-3.5.3-bin-hadoop3/examples/jars/spark-examples_2.12-3.5.3.jar 100
>>
>>
>>
>> My template yaml file is the same as the one provided at the Apache
>> Spark GitHub URL, with the additional details below:
>>
>> livenessProbe:
>>   failureThreshold: 3
>>   exec:
>>     command:
>>       - touch
>>       - /tmp/healthy
>>   initialDelaySeconds: 60
>>   periodSeconds: 10
>>   successThreshold: 1
>>   timeoutSeconds: 1
>> readinessProbe:
>>   failureThreshold: 3
>>   exec:
>>     command:
>>       - touch
>>       - /tmp/healthy
>>   initialDelaySeconds: 60
>>   periodSeconds: 10
>>   successThreshold: 1
>>   timeoutSeconds: 1
>>
>> and I restricted the security context capabilities of the pods via the
>> yaml file, including runAsUser:
>>
>>
>> securityContext: (under the container-level yaml entry)
>>   capabilities:
>>     drop:
>>       - MKNOD
>>       - KILL
>>       - SYS_CHROOT
>>
>> securityContext: (same, at the container-level yaml entry)
>>   fsGroup: 2000
>>   runAsGroup: 101
>>   runAsUser: 1001
>>
>>
>> Thanks,
>>
>> Jilani
>>
>>
>>
>> On Fri, Jan 3, 2025 at 12:41 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
>> wrote:
>>
>>> Could you elaborate what you mean by `not working`?
>>>
>>> > but it's not working.
>>>
>>> For the following question, Spark expects a normal Pod YAML file.
>>> You may want to take a look at the Apache Spark GitHub repository.
>>>
>>> > I do not have a  sample template file
>>>
>>> For example, the following files are used during K8s integration tests.
>>>
>>>
>>> https://github.com/apache/spark/tree/master/resource-managers/kubernetes/integration-tests/src/test/resources
>>>
>>> 1. driver-schedule-template.yml
>>> 2. driver-template.yml
>>> 3. executor-template.yml
>>>
>>> Dongjoon.
>>>
>>> On Thu, Jan 2, 2025 at 12:07 PM jilani shaik <jilani2...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to run Spark on a Kubernetes cluster, but that cluster
>>>> applies certain validations to every pod deployment, and they are not
>>>> allowing my spark-submit to run.
>>>>
>>>> For example, I need to add liveness and readiness probes and certain
>>>> security capability restrictions, which we usually do for all other
>>>> pods via a yaml file.
>>>>
>>>> I am not sure how to do that with spark-submit on K8s. I tried the
>>>> driver and executor template files, but it's not working. At the same
>>>> time, I do not have a sample template file from the documentation
>>>> except the lines below:
>>>>
>>>> --conf spark.kubernetes.driver.podTemplateFile=s3a://bucket/driver.yml
>>>> --conf spark.kubernetes.executor.podTemplateFile=s3a://bucket/executor.yml
>>>>
>>>>
>>>> Can someone provide directions on how to proceed further?
>>>>
>>>> Thanks,
>>>> Jilani
>>>>
>>>
