Hello, You are probably looking for this feature: https://issues.apache.org/jira/browse/FLINK-2976
Best, Gábor 2016-01-14 11:05 GMT+01:00 Niels Basjes <ni...@basjes.nl>: > Hi, > > I'm working on a streaming application using Flink. > Several steps in the processing are state-full (I use custom Windows and > state-full operators ). > > Now if during a normal run an worker fails the checkpointing system will be > used to recover. > > But what if the entire application is stopped (deliberately) or stops/fails > because of a problem? > > At this moment I have three main reasons/causes for doing this: > 1) The application just dies because of a bug on my side or a problem like > for example this (which I'm actually confronted with): Failed to Update > HDFS Delegation Token for long running application in HA mode > https://issues.apache.org/jira/browse/HDFS-9276 > 2) I need to rebalance my application (i.e. stop, change parallelism, start) > 3) I need a new version of my software to be deployed. (i.e. I fixed a bug, > changed the topology and need to continue) > > I assume the solution will be in some part be specific for my application. > The question is what features exist in Flink to support such a clean > "continue where I left of" scenario? > > -- > Best regards / Met vriendelijke groeten, > > Niels Basjes