Service Account not being honored using pyspark on Kubernetes

pisymbol . Wed, 29 Jan 2020 14:04:13 -0800

I am on k8s 1.17 in a small 4-node cluster. I am running Spark 2.4.4, but
with updated kubernetes-client jars to work around the 403 CVE issue.


I am running a Jupyter notebook in a pod in the 'default' namespace of my
cluster. I am trying to configure 'client mode' so I can use pyspark
interactively and watch the work being done on the executors.

Here is my SparkConf:

from pyspark import SparkConf
from pyspark.sql import SparkSession

sparkConf = SparkConf()
# Kubernetes API server of the cluster
sparkConf.setMaster("k8s://https://192.168.0.100:6443")
sparkConf.setAppName("pispark")
sparkConf.set("spark.kubernetes.container.image", "pidocker-docker-registry:5000/my-spark-py:v2.4.4")
# Executors should be created in the 'spark' namespace
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "3")
sparkConf.set("spark.driver.memory", "512m")
sparkConf.set("spark.executor.memory", "512m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")
# Service account I expect Spark to use
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.pullSecrets", "pidocker-docker-registry-secret")

spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
sc = spark.sparkContext

The problem is that when Spark initializes, I see the following error:

io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
User "system:serviceaccount:default:default" cannot watch resource "pods" in
API group "" in the namespace "spark"

But I am not using "default:default". I am using "spark:spark", which has
"edit" access via a clusterrolebinding in that namespace:

$  k describe clusterrolebinding/spark-role -n spark
Name:         spark-role
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  edit
Subjects:
  Kind            Name   Namespace
  ----            ----   ---------
  ServiceAccount  spark  spark
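
One check that might narrow this down is to look at which identity the
notebook pod itself carries, and compare it with the user named in the error.
The following is only a minimal sketch. It assumes the pod has a service
account token mounted at the usual in-pod path
/var/run/secrets/kubernetes.io/serviceaccount and that the token is a JWT with
a "sub" claim. The helper names token_subject and mounted_identity are
illustrative and not part of Spark or Kubernetes.

```python
import base64
import json
from pathlib import Path

# Assumed standard in-pod mount location for service account credentials.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def token_subject(token: str) -> str:
    """Return the 'sub' claim of a JWT without verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["sub"]


def mounted_identity(sa_dir: Path = SA_DIR):
    """Return the subject of the token mounted in this pod, or None if absent."""
    token_file = sa_dir / "token"
    if not token_file.is_file():
        return None
    return token_subject(token_file.read_text().strip())


if __name__ == "__main__":
    # Run inside the notebook pod to see the identity it presents.
    print(mounted_identity())
```

If this prints the same "system:serviceaccount:default:default" string that
appears in the exception, the request is being made with the notebook pod's
own credentials rather than with the "spark" service account.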

What am I doing wrong?

-aps
