Subject: Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes
Master Node
I think your attached exception has been fixed via FLINK-22597[1]. Could you
please try the latest version?
Moreover, it is not the desired Flink behavior that TaskManager could not
retrieve the new
ache.flink.runtime.jobmaster.JobMaster.startJobMasterServices(JobMaster.java:891)
> ~[flink-dist_2.12-1.12.2.jar:1.12.2]
>
> at
> org.apache.flink.runtime.jobmaster.JobMaster.startJobExecution(JobMaster.java:864)
> ~[flink-dist_2.12-1.12.2.jar:1.12.2]
>
> at
behaviors? Or is it a bug? Or am I missing
something?
Best,
Jerome
From: Yang Wang
Date: Tuesday, May 25, 2021 at 1:03 AM
To: Jerome Li
Cc: user@flink.apache.org
Subject: Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes
Master Node
By "restart master node", do you mean restarting the K8s master
components (e.g. APIServer, etcd)?
Even if the master components are restarted, the Flink JobManager and
TaskManager should eventually get back to work.
Could you please share the JobManager logs so that we could debug why it
crash
Hi,
I am running Flink v1.12.2 in standalone mode on Kubernetes, with
Kubernetes-native HA configured.
HA works well when either a jobmanager or a taskmanager pod is lost or crashes.
But when I restart the master node, the jobmanager pod always crashes and restarts.
This results in the entire Flink cluster r