> admission webhook "validation.gatekeeper.sh" denied the request:
First, your error is not related to Spark; it comes from the cluster policies enforced by Gatekeeper (https://open-policy-agent.github.io/gatekeeper/website/). That said, yes, you can fix it by providing a pod template for the Spark driver and executor pods.

I noticed that the pod template you provided is not valid. A pod template overrides the default configuration of Spark pods at the pod level, letting you specify fields such as annotations, labels, volumes, and other pod-level settings. However, livenessProbe and readinessProbe are container-specific configurations, not pod-level properties, so they belong under the container entry. To achieve your goal, adjust the template to include something like the following:

apiVersion: v1
kind: Pod
spec:
  securityContext:            # pod-level security context
    fsGroup: 2000
  containers:
    - name: spark-kubernetes-driver
      securityContext:        # container-level security context
        capabilities:
          drop:
            - MKNOD
            - KILL
            - SYS_CHROOT
        runAsUser: 1001
        runAsGroup: 101
      livenessProbe:
        failureThreshold: 3
        exec:
          command:
            - touch
            - /tmp/healthy
        initialDelaySeconds: 60
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      readinessProbe:
        failureThreshold: 3
        exec:
          command:
            - touch
            - /tmp/healthy
        initialDelaySeconds: 60
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1

On Fri, Jan 3, 2025 at 10:19 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Could you file an official Spark JIRA issue with those reproducers (your
> CLI command, YAML files, error messages)?
>
> In addition, it would be helpful if you could describe how you set up your
> K8s cluster, to make sure that it's not a K8s cluster issue.
>
> Once we have a JIRA, we can continue on that JIRA issue.
>
> BTW, the Spark driver pod is a normal pod, so you had better start with a
> successful pod YAML file before using `spark-submit`.
>
> In other words, please attach a sample `busybox`-image pod YAML file which
> passes all validations on your K8s cluster.
> Thanks,
> Dongjoon.
>
> On Fri, Jan 3, 2025 at 12:35 PM jilani shaik <jilani2...@gmail.com> wrote:
>
>> Thanks, Dongjoon, for the details.
>>
>> I added an almost identical YAML file as the template reference file,
>> and I am getting the error below:
>>
>> Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException:
>> Failure executing: POST at: https://k8sURL/api/v1/namespaces/namespace1/pods.
>> Message: Forbidden! User user123 doesn't have permission. admission webhook
>> "validation.gatekeeper.sh" denied the request:
>>
>> [must-have-probes] Container <spark-kubernetes-driver> in your <Pod>
>> <spark-pi-bd7ea2942dd485c3-driver> has no <livenessProbe>
>>
>> [must-have-probes] Container <spark-kubernetes-driver> in your <Pod>
>> <spark-pi-bd7ea2942dd485c3-driver> has no <readinessProbe>
>>
>> [psp-pods-allowed-user-ranges] Container spark-kubernetes-driver is
>> attempting to run without a required securityContext/runAsUser
>>
>> [restricted-capabilities] container <spark-kubernetes-driver> is not
>> dropping all required capabilities. Container must drop all of ["KILL",
>> "MKNOD", "SYS_CHROOT"] or "ALL"
>>
>> [k8s-emptydir-size] emptyDir volume <spark-local-dir-1> must have a size
>> limit.
>>         at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
>>         at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:518)
>>         at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
>>         at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340)
>>         at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:703)
>>         at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:92)
>>         at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
>>         at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1108)
>>         at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:92)
>>         at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:153)
>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6(KubernetesClientApplication.scala:256)
>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6$adapted(KubernetesClientApplication.scala:250)
>>         at org.apache.spark.util.SparkErrorUtils.tryWithResource(SparkErrorUtils.scala:48)
>>         at org.apache.spark.util.SparkErrorUtils.tryWithResource$(SparkErrorUtils.scala:46)
>>         at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:94)
>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:250)
>>         at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:223)
>>         at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
>>         at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
>>         at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
>>         at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
>>         at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
>>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
>>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>
>> My spark-submit command looks like this, along with the pod template file:
>>
>> spark-submit --verbose --master k8s://https://k8surl \
>>   --deploy-mode cluster --name spark-pi \
>>   --properties-file location1/spark-defaults.conf \
>>   --num-executors 5 \
>>   --conf spark.kubernetes.authenticate.driver.serviceAccountName=user123 \
>>   --conf spark.kubernetes.namespace=namespace1 \
>>   --conf spark.kubernetes.authenticate.caCertFile=location1/ca.crt \
>>   --conf spark.kubernetes.authenticate.oauthTokenFile=location1/token1 \
>>   --conf spark.kubernetes.authenticate.submission.caCertFile=location1/ca.crt \
>>   --conf spark.kubernetes.authenticate.submission.oauthTokenFile=location1/token1 \
>>   --conf spark.kubernetes.file.upload.path=/tmp/ \
>>   --conf spark.kubernetes.driver.podTemplateFile=location1/k8s/spark/driver.yaml \
>>   --conf spark.kubernetes.executor.podTemplateFile=location1/k8s/spark/driver.yaml \
>>   --class org.apache.spark.examples.SparkPi \
>>   spark-3.5.3-bin-hadoop3/examples/jars/spark-examples_2.12-3.5.3.jar 100
>>
>> My template YAML file is the same as the one in the Apache Spark GitHub
>> repository, with the following additional details:
>>
>> livenessProbe:
>>   failureThreshold: 3
>>   exec:
>>     command:
>>       - touch
>>       - /tmp/healthy
>>   initialDelaySeconds: 60
>>   periodSeconds: 10
>>   successThreshold: 1
>>   timeoutSeconds: 1
>> readinessProbe:
>>   failureThreshold: 3
>>   exec:
>>     command:
>>       - touch
>>       - /tmp/healthy
>>   initialDelaySeconds: 60
>>   periodSeconds: 10
>>   successThreshold: 1
>>   timeoutSeconds: 1
>>
>> and restricted security-context capabilities for the pods via the YAML
>> file, including run-as-user:
>>
>> securityContext:    # under the container-level YAML entry
>>   capabilities:
>>     drop:
>>       - MKNOD
>>       - KILL
>>       - SYS_CHROOT
>>
>> securityContext:    # same as the container-level YAML entry
>>   fsGroup: 2000
>>   runAsGroup: 101
>>   runAsUser: 1001
>>
>> Thanks,
>> Jilani
>>
>> On Fri, Jan 3, 2025 at 12:41 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>
>>> Could you elaborate on what you mean by "not working"?
>>>
>>> > but it's not working.
>>>
>>> For the following question, Spark expects a normal pod YAML file.
>>> You may want to take a look at the Apache Spark GitHub repository.
>>>
>>> > I do not have a sample template file
>>>
>>> For example, the following files are used during the K8s integration tests:
>>>
>>> https://github.com/apache/spark/tree/master/resource-managers/kubernetes/integration-tests/src/test/resources
>>>
>>> 1. driver-schedule-template.yml
>>> 2. driver-template.yml
>>> 3. executor-template.yml
>>>
>>> Dongjoon.
>>>
>>> On Thu, Jan 2, 2025 at 12:07 PM jilani shaik <jilani2...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to run Spark on a Kubernetes cluster, but that cluster
>>>> enforces certain validations on every pod it deploys, and those are
>>>> not allowing my spark-submit to run.
>>>>
>>>> For example, I need to add liveness and readiness probes and certain
>>>> security-capability restrictions, which we usually do for all other
>>>> pods via a YAML file.
>>>>
>>>> I am not sure how to achieve that with spark-submit on K8s. I tried
>>>> the driver and executor template files, but it's not working.
>>>> At the same time, I do not have a sample template file from the
>>>> documentation, apart from the lines below:
>>>>
>>>> --conf spark.kubernetes.driver.podTemplateFile=s3a://bucket/driver.yml
>>>> --conf spark.kubernetes.executor.podTemplateFile=s3a://bucket/executor.yml
>>>>
>>>> Can someone provide directions on how to proceed further?
>>>>
>>>> Thanks,
>>>> Jilani
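Following Dongjoon's advice to start with a pod YAML that deploys successfully before involving `spark-submit`, here is a minimal `busybox` pod sketch intended to satisfy the five Gatekeeper violations quoted in this thread (probes, runAsUser, dropped capabilities, and an emptyDir size limit). The image tag, user/group IDs, probe timings, and the 1Gi size limit are assumptions; adjust them to the ranges your cluster policies allow.

```yaml
# Sketch of a minimal busybox pod meant to pass the Gatekeeper constraints
# quoted in this thread. All concrete values (user IDs, size limit, probe
# command) are assumptions to adapt to your cluster.
apiVersion: v1
kind: Pod
metadata:
  name: policy-check
spec:
  securityContext:            # pod-level security context
    fsGroup: 2000
  containers:
    - name: busybox
      image: busybox:1.36
      command: ["sh", "-c", "touch /tmp/healthy && sleep 3600"]
      securityContext:        # container-level security context
        runAsUser: 1001       # satisfies [psp-pods-allowed-user-ranges]
        runAsGroup: 101
        capabilities:
          drop:               # satisfies [restricted-capabilities]
            - MKNOD
            - KILL
            - SYS_CHROOT
      livenessProbe:          # satisfies [must-have-probes]
        exec:
          command: ["touch", "/tmp/healthy"]
        initialDelaySeconds: 5
        periodSeconds: 10
      readinessProbe:
        exec:
          command: ["touch", "/tmp/healthy"]
        initialDelaySeconds: 5
        periodSeconds: 10
      volumeMounts:
        - name: spark-local-dir-1
          mountPath: /tmp/spark-local
  volumes:
    - name: spark-local-dir-1
      emptyDir:
        sizeLimit: 1Gi        # satisfies [k8s-emptydir-size]
```

If this pod passes validation (for example, `kubectl apply -f policy-check.yaml -n namespace1`), the same pod-level and container-level blocks can be carried over to the Spark driver and executor templates. For the `spark-local-dir-1` size limit specifically, Spark's volume properties (e.g. `spark.kubernetes.driver.volumes.emptyDir.spark-local-dir-1.options.sizeLimit`) may be an alternative to declaring the volume in the template; confirm the exact property name against the "Running on Kubernetes" docs for your Spark version.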