Barisa created FLINK-29382:
------------------------------

             Summary: Flink fails to start when created using quick guide for flink operator
                 Key: FLINK-29382
                 URL: https://issues.apache.org/jira/browse/FLINK-29382
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
    Affects Versions: 1.15.2
            Reporter: Barisa

I followed [https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/try-flink-kubernetes-operator/quick-start/] to deploy the Flink operator and then the Flink job.

When following the step
{{kubectl create -f https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-1.1/examples/basic.yaml}}
the pod starts, but then keeps crashing with the following exception:

{noformat}
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden: User "system:anonymous" cannot watch resource "pods" in API group "" in the namespace "zonda"
    at io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener.onFailure(WatcherWebSocketListener.java:74) ~[flink-dist-1.15.2.jar:1.15.2]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:570) ~[flink-dist-1.15.2.jar:1.15.2]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:199) ~[flink-dist-1.15.2.jar:1.15.2]
    at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:174) ~[flink-dist-1.15.2.jar:1.15.2]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[flink-dist-1.15.2.jar:1.15.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
{noformat}

I also noticed the following log lines:

{noformat}
2022-09-21 13:32:05,715 WARN  io.fabric8.kubernetes.client.Config [] - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2022-09-21 13:32:05,719 WARN  io.fabric8.kubernetes.client.Config [] - Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
{noformat}

I think the problem is that the container runs as user root, which later uses gosu to become the flink user.
However, the service account token is only accessible to the main user in the container, which is root:

{noformat}
root@basic-example-658578895d-qwlb2:/opt/flink# ls -hltr /var/run/secrets/kubernetes.io/serviceaccount/token
lrwxrwxrwx. 1 root 1337 12 Sep 21 08:57 /var/run/secrets/kubernetes.io/serviceaccount/token -> ..data/token
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
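
One way to test the hypothesis that the flink user cannot read the token is to attempt the read as that user and report the exact error. The following is only a minimal sketch, assuming a Python 3 interpreter is available in the container (the stock image may not provide one); the helper name {{check_token_readable}} is hypothetical, and the default path is the one from the warning in the log above.

```python
import os

# Path taken from the "Error reading service account token" warning above.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def check_token_readable(path=TOKEN_PATH):
    """Return (ok, detail) telling whether the current user can read `path`.

    The file is actually opened rather than inspected with stat, because the
    token is a symlink and the symlink's own mode bits (lrwxrwxrwx) say
    nothing about the permissions of the file it points to.
    """
    try:
        with open(path, "rb") as fh:
            fh.read(1)  # read a single byte; the token itself is not printed
    except OSError as exc:
        return False, f"{type(exc).__name__}: {exc.strerror}"
    return True, "readable"


if __name__ == "__main__":
    ok, detail = check_token_readable()
    print(f"uid={os.getuid()} gid={os.getgid()} path={TOKEN_PATH} -> {detail}")
```

Running it once as root and once as the flink user would show whether the read fails only after the switch to the flink user, which is what the hypothesis above predicts.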