Hi Greg, yes, certainly: there are more requirements to this than the quick sketch I gave above, and that seems to be one of them.
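To make that concrete, here is a rough sketch of the kind of sink I had in mind: a versioned upsert, so that both jobs can write concurrently and an out-of-order write from the terminated job becomes a no-op. This is not anything that ships with Flink; the "events" table, the Event POJO, and the PostgreSQL connection string are all made up for illustration:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    public class IdempotentJdbcSink extends RichSinkFunction<IdempotentJdbcSink.Event> {

        /** Hypothetical POJO; version must increase monotonically per key. */
        public static class Event {
            public long id;
            public String payload;
            public long version;
        }

        private transient Connection connection;
        private transient PreparedStatement upsert;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Hypothetical connection string; pooling/credentials omitted for brevity.
            connection = DriverManager.getConnection("jdbc:postgresql://db:5432/app");
            // Upsert keyed on id; the WHERE clause makes a stale write a no-op,
            // so a late write from the old job cannot clobber newer data.
            upsert = connection.prepareStatement(
                "INSERT INTO events (id, payload, version) VALUES (?, ?, ?) "
                    + "ON CONFLICT (id) DO UPDATE SET "
                    + "payload = EXCLUDED.payload, version = EXCLUDED.version "
                    + "WHERE events.version < EXCLUDED.version");
        }

        @Override
        public void invoke(Event event) throws Exception {
            upsert.setLong(1, event.id);
            upsert.setString(2, event.payload);
            upsert.setLong(3, event.version);
            upsert.executeUpdate();
        }

        @Override
        public void close() throws Exception {
            if (upsert != null) upsert.close();
            if (connection != null) connection.close();
        }
    }

The same idea should work with any store that supports conditional writes (e.g. Cassandra lightweight transactions or HBase checkAndPut); the important parts are a deterministic key and a version that increases monotonically per key.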
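And for completeness, the savepoint-shutdown-restore cycle from my earlier mail looks roughly like this on the CLI. This is from memory, so please check the savepoint docs linked below for the exact flags; <jobID>, <savepointPath>, and my-job.jar are placeholders:

    # 1) trigger a savepoint; the command prints the savepoint path
    bin/flink savepoint <jobID>

    # 2) cancel the running job
    bin/flink cancel <jobID>

    # 3) update machines / Flink / the job, then resume from the savepoint
    bin/flink run -s <savepointPath> my-job.jar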
Cheers,
Aljoscha

On Thu, 22 Dec 2016 at 17:54 Greg Hogan <c...@greghogan.com> wrote:
> Aljoscha,
>
> For the second possible solution, is there also a requirement that the
> data sinks handle out-of-order writes? If the new job outpaces the old
> job, which is then terminated, the final write from the old job could
> have overwritten "newer" writes from the new job.
>
> Greg
>
> On Tue, Dec 20, 2016 at 12:27 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
>> Hi,
>> zero-downtime updates are currently not supported. What is supported in
>> Flink right now is a savepoint-shutdown-restore cycle. With this, you
>> first draw a savepoint (which is essentially a checkpoint with some
>> metadata), then you cancel your job, then you do whatever you need to do
>> (update machines, update Flink, update the job), and restore from the
>> savepoint.
>>
>> A possible solution for zero-downtime updates would be to draw a
>> savepoint, then start a second Flink job from that savepoint, then shut
>> down the first job. With this, your data sinks would need to be able to
>> handle being written to by two jobs at the same time, i.e. writes should
>> probably be idempotent.
>>
>> This is the link to the savepoint documentation:
>> https://ci.apache.org/projects/flink/flink-docs-release-1.2/setup/savepoints.html
>>
>> Does that help?
>>
>> Cheers,
>> Aljoscha
>>
>> On Fri, 16 Dec 2016 at 18:16 Andrew Hoblitzell <ahoblitz...@salesforce.com> wrote:
>>> Hi. Does Apache Flink currently have support for zero downtime or the
>>> ability to do rolling upgrades?
>>>
>>> If so, what are concerns to watch for and what best practices might
>>> exist? Are there version management and data inconsistency issues to
>>> watch for?