Hello,

We are running Flink on Kubernetes (standalone) in application cluster mode. The JobManager is deployed as a Deployment, and we run only one instance/replica of it, so the leader election service is not required. We have also set the Flink task execution retries to infinite.
Do we still need an HA setup? We have tested our application without configuring HA, and it seems to restore from checkpoints after failures. Does the Flink JobManager keep in memory the information that it would otherwise store in the HA system? If it does, is the only reason to configure HA to achieve resiliency against pod evictions (caused by node failures, scheduling, etc.)?

Thanks,
Omkar