I think this is the expected behavior. When the active JobManager loses
leadership, the standby JobManager takes over and recovers the job from
the latest successful checkpoint.

High availability only provides leader election/retrieval and HA metadata
storage (e.g. job graphs, checkpoints, etc.).
It cannot prevent job restarts when the JobManager fails.
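
For reference, a minimal sketch of the flink-conf.yaml settings that this
kind of ZooKeeper-based HA setup relies on. The host names and storage paths
below are placeholders I made up for illustration, not values from your
cluster, so adjust them to your environment:

```yaml
# ZooKeeper-based HA: handles leader election/retrieval between JobManagers
high-availability: zookeeper
# Placeholder quorum; replace with your own ZooKeeper hosts
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181
# Placeholder path; HA metadata (job graphs, checkpoint pointers) is stored here
high-availability.storageDir: hdfs:///flink/ha/
# Placeholder path; checkpoints the new leader recovers the job from
state.checkpoints.dir: hdfs:///flink/checkpoints/
```

With settings like these, the new leader can find the job and its latest
checkpoint after a failover, but the job is still restarted from that
checkpoint.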

Best,
Yang

Giselle van Dongen <giselle.vandon...@ugent.be> wrote on Wed, Jan 6, 2021 at 6:23 AM:

> Hi!
>
>
> We are running a highly available Flink cluster in standalone mode with
> ZooKeeper, with 2 JobManagers and 5 TaskManagers.
>
> When the jobmanager is killed, the standby jobmanager takes over. But the
> job is also restarted.
>
> Is this the default behavior and can we avoid job restarts (for jobmanager
> failure) in some way?
>
>
> Thank you,
>
> Giselle
>
