Hi,

Can we have multiple replicas with ZooKeeper HA in K8s as well? In that case, how do TaskManagers and clients recover the JobManager RPC address? Is it updated in ZooKeeper? Also, since there are 3 replicas behind the same service endpoint and only one of them is the leader, how should clients reach the leader JobManager?

On Wednesday, 20 January, 2021, 07:41:20 am IST, Yang Wang <danrtsey...@gmail.com> wrote:

If you do not want to run multiple JobManagers simultaneously, then I think the "Job" for an application cluster with HA enabled is enough. K8s will also launch a new pod/container when the old one terminates exceptionally.

Best,
Yang
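For illustration, a minimal flink-conf.yaml sketch of the ZooKeeper HA setup asked about above; the quorum hosts, storage path, and cluster id below are placeholders, not values from this thread:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-0.zk:2181,zk-1.zk:2181,zk-2.zk:2181
    high-availability.storageDir: s3://my-bucket/flink/ha     # placeholder path
    high-availability.cluster-id: /my-flink-cluster           # placeholder id

With an HA service configured, the elected leader publishes its RPC address to the HA backend (ZooKeeper here), and TaskManagers and clients look the leader address up there rather than relying on a fixed Kubernetes Service endpoint.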
Yes. Using a "Deployment" instead of "Job" for the application cluster also makes sense.Actually, in the native K8s integration, we always use the deployment for JobManager. But please note that the deployment may relaunch the JobManager pod even though you cancelthe Flink job. Best,Yang Ashish Nigam <ashnigamt...@gmail.com> 于2021年1月20日周三 上午5:29写道: Yang,For Application clusters, does it make sense to deploy JobManager as "Deployment" rather than as a "Job", as suggested in docs?I am asking this because I am thinking of deploying a job manager in HA mode even for application clusters. ThanksAshish On Tue, Jan 19, 2021 at 6:16 AM Yang Wang <danrtsey...@gmail.com> wrote: Usually, you do not need to start multiple JobManager simultaneously. The JobManager is a deployment.A new one pod/container will be launched once it terminated exceptionally. If you still want to start multiple JobManagers to get a faster recovery, you could set the replica greater than 1for standalone cluster on K8s[1]. For native integration[2], we still have not supported such configuration[2]. Please note that the key point to enable HA is not start multiple JobManagers simultaneously or sequently.You need to set the ZooKeeperHAService[4] or KubernetesHAService[5] to ensure the Flink job could recoverfrom latest successful checkpoint. [1]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/resource-providers/standalone/kubernetes.html#session-cluster-resource-definitions[2]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/resource-providers/native_kubernetes.html[3]. https://issues.apache.org/jira/browse/FLINK-17707[4]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/ha/zookeeper_ha.html[5]. https://ci.apache.org/projects/flink/flink-docs-master/deployment/ha/kubernetes_ha.html Best,Yang Amit Bhatia <bhatia.amit1...@gmail.com> 于2021年1月19日周二 下午8:45写道: Hi, I am deploying Flink 1.12 on K8s. Can anyone confirm if we can deploy multiple job manager pods in K8s for HA or it should always be only a single job manager pod ? Regards,Amit Bhatia