Sorry, I haven't followed this development, but roughly how much more costly is the new serialization for savepoints?
On Wed, Oct 12, 2016 at 5:51 AM, SHI Xiaogang <shixiaoga...@gmail.com> wrote:
> Hi all,
>
> Currently, savepoints are exactly the completed checkpoints, and Flink
> provides commands (save/run) for saving and restoring jobs. But in the
> near future, savepoints will be very different from checkpoints because
> they will have a common serialization format and allow recovery from major
> updates. Saving and restoring based on savepoints will be more costly.
>
> To provide efficient saving and restoring of jobs, we propose to add two
> more commands to Flink: SUSPEND and RESUME, which are based on checkpoints.
>
> As the implementation of checkpoints depends on the backends (and many
> other components in Flink), suspending and resuming may not work if there
> are major changes in the job or in Flink (e.g., different backends). But as
> the implementation is based on checkpoints instead of savepoints, these
> commands are expected to be more efficient.
>
> The details of the design can be viewed in the Google Doc: Support Resuming
> and Suspending of Flink Jobs
> <https://docs.google.com/document/d/1c3vUOTrNlCu2uhfi5ZNYpAguoFR03NgQWZpDTkSxVjg/edit?usp=sharing>
>
> I look forward to your comments. Any feedback is appreciated. :)
>
> Thanks,
> Xiaogang