Job Manager becomes irresponsive if the size of the session cluster grows

Prakhar Mathur Wed, 17 Jul 2019 22:54:41 -0700

Hello,

We have deployed multiple Flink clusters on Kubernetes, each with 1
JobManager replica and as many TaskManager replicas as required. Recently we
have observed that when we increase the number of TaskManagers in a cluster,
the JobManager becomes unresponsive. It stops sending StatsD metrics for
irregular intervals. The JobManager pod also keeps restarting, because it
stops responding to the liveness probe and Kubernetes then kills the pod.
We tried increasing the allocated resources (CPU, RAM), but it didn't help.


Regards
Prakhar Mathur
Product Engineer
GO-JEK
