As tison said, you can use a Kubernetes Deployment to restart the
JobManager pod. However, if you want all jobs to be able to recover from
their checkpoints, you also need ZooKeeper and HDFS/S3 to store the
high-availability data.
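For reference, a minimal sketch of what the ZooKeeper-based HA settings in flink-conf.yaml could look like — the quorum addresses, storage path, and cluster id below are illustrative placeholders, not values from this thread:

```yaml
# Enable ZooKeeper-based high availability (illustrative values)
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
# Durable storage for JobManager metadata; an s3:// path would also work
high-availability.storageDir: hdfs:///flink/ha/
# Isolates this cluster's HA data in ZooKeeper
high-availability.cluster-id: /my-flink-cluster
```

With this in place, a restarted JobManager pod can recover the submitted jobs and resume them from the latest completed checkpoint.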

Also, native Kubernetes HA support is planned[1]. Once that lands, you
will no longer depend on ZooKeeper.

[1]. https://issues.apache.org/jira/browse/FLINK-12884

tison <wander4...@gmail.com> wrote on Mon, Feb 10, 2020 at 8:59 AM:

> Hi Krzysztof,
>
> Flink doesn't provide JM HA itself yet.
>
> For YARN deployments, you can rely on the yarn.application-attempts
> configuration[1]; for Kubernetes deployments, Flink uses a Kubernetes
> Deployment to restart a failed JM.
>
> However, standalone mode doesn't tolerate JM failure, and the strategies
> above just restart the application, which means all tasks will be killed
> and restarted.
>
> Best,
> tison.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html#configuration-1
>
>
> KristoffSC <krzysiek.chmielew...@gmail.com> wrote on Fri, Feb 7, 2020 at 11:34 PM:
>
>> Hi,
>> In [1] we can find the setup for standalone and YARN clusters to achieve
>> Job Manager HA.
>>
>> Is the standalone-cluster high-availability approach with ZooKeeper the
>> same approach to use for Docker's job-cluster deployment on Kubernetes?
>>
>> [1]
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
>>
>> Thanks,
>> Krzysztof
>>
>>
>>
>>
>
