Hi Prakhar,

Have you checked the garbage collection (GC) activity of the master (JobManager)?
Which version of Flink are you using? How many TaskManagers are in your
cluster?


Prakhar Mathur <prakha...@go-jek.com> wrote on Thu, Jul 18, 2019 at 1:54 PM:

> Hello,
>
> We have deployed multiple Flink clusters on Kubernetes, each with 1 JobManager
> replica and multiple TaskManager replicas as required. Recently we have
> observed that when we increase the number of TaskManagers in a cluster,
> the JobManager becomes unresponsive. It stops sending statsd metrics for
> irregular intervals. The JobManager pod also keeps restarting, because
> it stops responding to the liveness probe, which results in Kubernetes
> killing the pod. We tried increasing the resources given (CPU, RAM), but it
> didn't help.
>
> Regards
> Prakhar Mathur
> Product Engineer
> GO-JEK
>
