Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-05 Thread Yang Wang
This might be related to FLINK-28481, which is a bug in the fabric8io k8s client [1]. [1] https://issues.apache.org/jira/browse/FLINK-28481 Best, Yang On Tue, Feb 6, 2024 at 12:30 PM Lavkesh Lahngir wrote: > Hi, Matthias, I was wondering if there are any timeout or heartbeat > configurations for Ku

Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-05 Thread Lavkesh Lahngir
Hi, Matthias, I was wondering if there are any timeout or heartbeat configurations for KubeHA available. Thanks. On Mon, 5 Feb 2024 at 8:58 PM, Matthias Pohl wrote: > That's stated in the Jira issue. I didn't have the time to investigate it > further. > > On Mon, Feb 5, 2024 at 1:55 PM Lavkesh

Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-05 Thread Matthias Pohl
That's stated in the Jira issue. I didn't have the time to investigate it further. On Mon, Feb 5, 2024 at 1:55 PM Lavkesh Lahngir wrote: > Hi Matthias, > Thanks for the suggestion. Do we know which part of the code caused this issue > and how it was fixed? > > Thanks! > > On Mon, 5 Feb 2024 at 18:06

Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-05 Thread Lavkesh Lahngir
Hi Matthias, Thanks for the suggestion. Do we know which part of the code caused this issue and how it was fixed? Thanks! On Mon, 5 Feb 2024 at 18:06, Matthias Pohl wrote: > Hi Lavkesh, > FLINK-33998 [1] sounds quite similar to what you describe. > > The solution was to upgrade to Flink version 1.1

Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-05 Thread Matthias Pohl
Hi Lavkesh, FLINK-33998 [1] sounds quite similar to what you describe. The solution was to upgrade to Flink version 1.14.6. I didn't have the capacity to look into the details considering that the mentioned Flink version 1.14 is not officially supported by the community anymore and a fix seems to

Re: Frequent Flink JM restarts due to Kube API server errors.

2024-02-04 Thread Lavkesh Lahngir
Hi, A few more details: We are running GKE version 1.27.7-gke.1121002 and using Flink version 1.14.3. Thanks! On Mon, 5 Feb 2024 at 12:05, Lavkesh Lahngir wrote: > Hi all, > > We run a Flink operator on GKE, deploying one Flink job per job manager. > We utilize > org.apache.flink.kubernetes.hi