Hi,
zero-downtime updates are currently not supported. What is supported in
Flink right now is a savepoint-shutdown-restore cycle: you first trigger a
savepoint (which is essentially a checkpoint with some metadata), then you
cancel your job, then you do whatever you need to do (update machines,
update Flink, update the job), and finally restore from the savepoint.

A possible approach to a zero-downtime update would be to take a savepoint,
then start a second Flink job from that savepoint, then shut down the first
job. With this, your data sinks would need to be able to handle being
written to by two jobs at the same time, i.e. writes should probably be
idempotent.
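To make the idempotence point concrete, here is a minimal, self-contained
sketch (not Flink API, just plain Java) of what an idempotent sink could
look like: writes are keyed upserts, so the overlap window in which both
jobs emit the same keyed result does not corrupt the stored state. The
class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sink whose writes are keyed upserts: applying the same
// (key, value) write twice leaves the store in the same state, so two
// jobs writing concurrently during a handover converge.
public class IdempotentSink {
    private final Map<String, Long> store = new ConcurrentHashMap<>();

    // Upsert by key; duplicate writes are harmless.
    public void write(String key, long value) {
        store.put(key, value);
    }

    public Long get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        IdempotentSink sink = new IdempotentSink();
        sink.write("user-42", 7L); // write from the old job
        sink.write("user-42", 7L); // duplicate write from the new job
        System.out.println(sink.get("user-42")); // 7
    }
}
```

An append-only sink (e.g. plain inserts into a log table) would instead
produce duplicates during the overlap, which downstream consumers would
then have to deduplicate themselves.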

This is the link to the savepoint doc:
https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/savepoints.html

Does that help?

Cheers,
Aljoscha

On Fri, 16 Dec 2016 at 18:16 Andrew Hoblitzell <ahoblitz...@salesforce.com>
wrote:

> Hi. Does Apache Flink currently have support for zero down time or the
> ability to do rolling upgrades?
>
> If so, what are concerns to watch for and what best practices might
> exist? Are there version management and data inconsistency issues to
> watch for?
>
