Hey Niels,

as Gabor wrote, this feature has been merged to the master branch recently.

The docs are online here: 
https://ci.apache.org/projects/flink/flink-docs-master/apis/savepoints.html

Feel free to report back your experience with it if you give it a try.

– Ufuk

> On 14 Jan 2016, at 11:09, Gábor Gévay <gga...@gmail.com> wrote:
> 
> Hello,
> 
> You are probably looking for this feature:
> https://issues.apache.org/jira/browse/FLINK-2976
> 
> Best,
> Gábor
> 
> 
> 
> 
> 2016-01-14 11:05 GMT+01:00 Niels Basjes <ni...@basjes.nl>:
>> Hi,
>> 
>> I'm working on a streaming application using Flink.
>> Several steps in the processing are state-full (I use custom Windows and
>> state-full operators ).
>> 
>> Now if during a normal run an worker fails the checkpointing system will be
>> used to recover.
>> 
>> But what if the entire application is stopped (deliberately) or stops/fails
>> because of a problem?
>> 
>> At this moment I have three main reasons/causes for doing this:
>> 1) The application just dies because of a bug on my side or a problem like
>> for example this (which I'm actually confronted with):  Failed to Update
>> HDFS Delegation Token for long running application in HA mode
>> https://issues.apache.org/jira/browse/HDFS-9276
>> 2) I need to rebalance my application (i.e. stop, change parallelism, start)
>> 3) I need a new version of my software to be deployed. (i.e. I fixed a bug,
>> changed the topology and need to continue)
>> 
>> I assume the solution will be in some part be specific for my application.
>> The question is what features exist in Flink to support such a clean
>> "continue where I left of" scenario?
>> 
>> --
>> Best regards / Met vriendelijke groeten,
>> 
>> Niels Basjes

Reply via email to