Re: Flink HA for Job Cluster

2020-02-10 Thread KristoffSC
Thank you both for the answers. I just want to get this right: can I achieve HA for the Job Cluster Docker setup by configuring the ZooKeeper quorum as described in [1] (with S3 and ZooKeeper)? I assume I need to modify the default Job Cluster config to match the setup in [1]. [1] https://ci.apach

Re: Flink HA for Job Cluster

2020-02-09 Thread KristoffSC
Thank you both for the answers. I just want to get this right: can I achieve HA for the Job Cluster Docker setup by configuring the ZooKeeper quorum as described in [1] (with S3 and ZooKeeper)? I assume I need to modify the default Job Cluster config to match the setup in [1]. [1] https://ci.apach

Re: Flink HA for Job Cluster

2020-02-09 Thread Yang Wang
Just as tison said, you could use a Kubernetes deployment to restart the jobmanager pod. However, if you want all jobs to be able to recover from their checkpoints, you also need ZooKeeper and HDFS/S3 to store the high-availability data. Kubernetes-native HA support is also planned [1].
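As a rough illustration of the ZooKeeper plus HDFS/S3 setup mentioned above, here is a minimal sketch that renders the relevant flink-conf.yaml entries. The option keys are the ZooKeeper HA options from the Flink documentation; the helper name render_ha_config and all values (ZooKeeper hosts, bucket path, cluster id) are hypothetical placeholders and not taken from this thread.

```python
def render_ha_config(zk_quorum, storage_dir, cluster_id):
    """Render ZooKeeper HA entries in flink-conf.yaml syntax.

    zk_quorum   -- list of "host:port" strings for the ZooKeeper ensemble
    storage_dir -- durable location (for example an S3 or HDFS URI) for HA metadata
    cluster_id  -- ZooKeeper sub-path that isolates this job cluster
    """
    settings = {
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": ",".join(zk_quorum),
        "high-availability.storageDir": storage_dir,
        "high-availability.cluster-id": cluster_id,
    }
    # One "key: value" pair per line, as flink-conf.yaml expects.
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    # Placeholder values; replace them with your own environment.
    print(render_ha_config(
        ["zk-0:2181", "zk-1:2181", "zk-2:2181"],
        "s3://my-bucket/flink/ha",
        "/my-job-cluster",
    ))
```

The printed lines would be appended to the flink-conf.yaml used by the Job Cluster image. An S3 storage directory additionally assumes that one of Flink's S3 filesystem implementations is available to the image.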

Re: Flink HA for Job Cluster

2020-02-09 Thread tison
Hi Krzysztof, Flink doesn't provide JM HA itself yet. For a YARN deployment, you can rely on the yarn.application-attempts configuration [1]; for a Kubernetes deployment, Flink relies on a Kubernetes deployment to restart a failed JM. However, such a standalone mode doesn't tolerate JM failure and strategies above

Flink HA for Job Cluster

2020-02-07 Thread KristoffSC
Hi, in [1] we can find the setup for Standalone and YARN clusters to achieve JobManager HA. Is Standalone Cluster High Availability with ZooKeeper the same approach as for a Docker Job Cluster on Kubernetes? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobma