I am on k8s 1.17 in a small 4-node cluster. I am running Spark 2.4.4, but
with updated kubernetes-client jars to work around the 403 CVE issue.
My Jupyter notebook runs in a pod in the 'default' namespace of the cluster.
I am trying to configure 'client mode' so I can use pyspark interactively
and watch the work being done on the executors.
Here is my SparkConf:
from pyspark import SparkConf
from pyspark.sql import SparkSession

sparkConf = SparkConf()
# Kubernetes API server and application name
sparkConf.setMaster("k8s://https://192.168.0.100:6443")
sparkConf.setAppName("pispark")
# Executor image, target namespace, and sizing
sparkConf.set("spark.kubernetes.container.image", "pidocker-docker-registry:5000/my-spark-py:v2.4.4")
sparkConf.set("spark.kubernetes.namespace", "spark")
sparkConf.set("spark.executor.instances", "3")
sparkConf.set("spark.driver.memory", "512m")
sparkConf.set("spark.executor.memory", "512m")
sparkConf.set("spark.kubernetes.pyspark.pythonVersion", "3")
# Service account and registry credentials
sparkConf.set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.authenticate.serviceAccountName", "spark")
sparkConf.set("spark.kubernetes.pullSecrets", "pidocker-docker-registry-secret")

spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
sc = spark.sparkContext
The problem is that when Spark initializes, I see the following error:
io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden:
User "system:serviceaccount:default:default" cannot watch resource "pods" in
API group "" in the namespace "spark"
But I am not using "default:default"; I am using "spark:spark", which has
"edit" access via a clusterrolebinding in that namespace:
$ k describe clusterrolebinding/spark-role -n spark
Name:         spark-role
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  ClusterRole
  Name:  edit
Subjects:
  Kind            Name   Namespace
  ----            ----   ---------
  ServiceAccount  spark  spark
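
For reference, here is a minimal sketch of how the identity named in the
error could be checked from inside the notebook pod. It assumes the pod has
a service account token mounted at the standard Kubernetes path and that the
driver falls back to those mounted credentials; I have not confirmed the
second point. The helper names (token_subject, mounted_identity) are made up
for illustration. The token is only decoded, not verified.

```python
import base64
import json
from pathlib import Path

# Standard location where Kubernetes mounts a pod's service account credentials.
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")


def token_subject(token: str) -> str:
    """Return the 'sub' claim of a JWT without verifying its signature."""
    payload_b64 = token.strip().split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["sub"]


def mounted_identity(sa_dir: Path = SA_DIR) -> str:
    """Read the mounted token file and report the identity it carries."""
    return token_subject((sa_dir / "token").read_text())


if __name__ == "__main__":
    if (SA_DIR / "token").exists():
        print(mounted_identity())
    else:
        print("no service account token mounted at", SA_DIR)
```

If this prints the same "system:serviceaccount:default:default" as the error,
that would suggest the request is being made with the notebook pod's own
service account rather than with "spark:spark".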
What am I doing wrong?
-aps