Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node

2021-07-12 Thread Jerome Li
I think your attached exception has been fixed via FLINK-22597[1]. Could you please try the latest version? Moreover, it is not the desired Flink behavior that TaskManager could not retrieve the new

Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node

2021-05-26 Thread Yang Wang
ache.flink.runtime.jobmaster.JobMaster.startJobMasterServices(JobMaster.java:891) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
at org.apache.flink.runtime.jobmaster.JobMaster.startJobExecution(JobMaster.java:864) ~[flink-dist_2.12-1.12.2.jar:1.12.2]
at

Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node

2021-05-26 Thread Jerome Li
behaviors? Or is it a bug? Or am I missing something? Best, Jerome From: Yang Wang Date: Tuesday, May 25, 2021 at 1:03 AM To: Jerome Li Cc: user@flink.apache.org Subject: Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node By "restart master node", do y

Re: Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node

2021-05-25 Thread Yang Wang
By "restart master node", do you mean restarting the K8s master components (e.g., APIServer, etcd)? Even if the master components are restarted, the Flink JobManager and TaskManager should eventually get back to work. Could you please share the JobManager logs so that we can debug why it crash

Jobmanager Crashes with Kubernetes HA When Restart Kubernetes Master Node

2021-05-24 Thread Jerome Li
Hi, I am running Flink v1.12.2 in Standalone mode on Kubernetes, with Kubernetes native HA enabled. HA works well when either the jobmanager or a taskmanager pod is lost or crashes. But when I restart the master node, the jobmanager pod always crashes and restarts. This results in the entire Flink cluster r