Re: [DISCUSS] Support Suspending and Resuming of Flink Jobs

Greg Hogan Wed, 12 Oct 2016 05:35:50 -0700

Sorry, I haven't followed this development, but roughly how much more
costly is the new serialization for savepoints?


On Wed, Oct 12, 2016 at 5:51 AM, SHI Xiaogang <shixiaoga...@gmail.com>
wrote:

> Hi all,
>
> Currently, savepoints are simply completed checkpoints, and Flink
> provides commands (save/run) to save and restore jobs. But in the
> near future, savepoints will be very different from checkpoints because
> they will have a common serialization format and allow recovery after
> major updates. Saving and restoring based on savepoints will therefore be
> more costly.
>
> To provide efficient saving and restoring of jobs, we propose adding two
> more commands to Flink, SUSPEND and RESUME, which are based on checkpoints.
>
> As the implementation of checkpoints depends on the backends (and many
> other components in Flink), suspending and resuming may not work if there
> are major changes to the job or to Flink (e.g., different backends). But
> because these commands are based on checkpoints instead of savepoints, they
> should be more efficient.
>
> The details of the design can be viewed in the Google Doc: Support Resuming
> and Suspending of Flink Jobs
> <https://docs.google.com/document/d/1c3vUOTrNlCu2uhfi5ZNYpAguoFR03NgQWZpDTkSxVjg/edit?usp=sharing>.
>
> Looking forward to your comments. Any feedback is appreciated. :)
>
> Thanks,
> Xiaogang
>
