What got this working for us was increasing
kubernetes.operator.reconcile.interval and
kubernetes.operator.observer.progress-check.interval, which made
reconciliation slower but smoother when applying a large number of
bundled FlinkDeployments. We also bumped
kubernetes
Currently, 16GB of heap is allocated to the flink-kubernetes-operator
container via *jvmArgs.operator*, but this didn't help either.
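
For anyone following along, here is a rough sketch of how the settings
mentioned in this thread might look in the operator's Helm values file. The
interval values are placeholders (the thread does not give the actual
numbers), and -Xmx16g is only an assumption about how the 16GB heap was
expressed in jvmArgs.operator:

```yaml
# values.yaml for the flink-kubernetes-operator Helm chart (illustrative only)
defaultConfiguration:
  create: true
  append: true
  flink-conf.yaml: |+
    # Placeholder values, not taken from the thread; tune for your cluster.
    kubernetes.operator.reconcile.interval: 5 min
    kubernetes.operator.observer.progress-check.interval: 30 s

jvmArgs:
  # Assumed flag; the thread only says 16GB of heap was allocated.
  operator: "-Xmx16g"
```

Longer intervals trade reconciliation latency for lower load on the operator,
so they are worth raising gradually rather than all at once.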
On Wed, Nov 8, 2023 at 5:56 PM Tony Chen wrote:
> Hi Flink Community,
>
> This is a follow-up on a previous email thread (see email thread below).
> Af
Hi Flink Community,
This is a follow-up on a previous email thread (see email thread below).
After reducing the number of operator pods to 1, we no longer encounter
the multiple leaders issue, but our singleton operator pod restarts
whenever we have 150+ FlinkDeployments. Sometimes, the