Niels Basjes created FLINK-4620:
-----------------------------------

             Summary: Automatically creating savepoints
                 Key: FLINK-4620
                 URL: https://issues.apache.org/jira/browse/FLINK-4620
             Project: Flink
          Issue Type: New Feature
          Components: State Backends, Checkpointing
    Affects Versions: 1.1.2
            Reporter: Niels Basjes

In the current versions of Flink you can run an external command, after which a savepoint is persisted in a durable location.

Feature request: Make this a lot more automatic and easy to use.

_Proposed workflow_
# In my application I do something like this:
{code}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStateBackend(new FsStateBackend("hdfs:///tmp/applicationState"));
env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
// Proposed calls (they do not exist yet): interval in milliseconds, number of savepoints to keep
env.enableAutomaticSavePoints(300000);
env.enableAutomaticSavePointCleaner(10);
{code}
# When I start the application for the first time the state backend is 'empty'. I expect the system to start in a clean state. After 5 minutes (300000 ms) a savepoint is created and stored.
# When I stop and start the topology again it will automatically restore the last available savepoint.

Things to think about:
* Note that even with this feature the manual version is still useful!
* What to do on startup if the state is incompatible with the topology? Fail the startup?
* How many automatic savepoints do we keep? Only the last one?
* Perhaps the API should allow multiple automatic savepoints at different intervals in different locations.
{code}
// Make every 5 minutes and keep the last 10
env.enableAutomaticSavePoints(300000, new FsStateBackend("hdfs:///tmp/applicationState"), 10);

// Make every 24 hours and keep the last 30
// Useful for being able to reproduce a problem a few days later
env.enableAutomaticSavePoints(86400000, new FsStateBackend("hdfs:///tmp/applicationDailyStateSnapshot"), 30);
{code}
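
The "keep the last N" question raised above could be sketched roughly as follows. This is only an illustration in plain Java, not Flink code: the class `SavepointRetention` and its methods `register` and `latest` are hypothetical names invented for this example, and actually deleting the expired savepoints from storage is left to the caller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical helper (not part of Flink): remembers savepoint paths in
 * creation order and keeps only the newest maxRetained of them.
 */
final class SavepointRetention {
    private final int maxRetained;
    private final Deque<String> retained = new ArrayDeque<>();

    SavepointRetention(int maxRetained) {
        if (maxRetained < 1) {
            throw new IllegalArgumentException("maxRetained must be at least 1");
        }
        this.maxRetained = maxRetained;
    }

    /** Registers a newly written savepoint and returns the paths that may now be deleted. */
    List<String> register(String savepointPath) {
        retained.addLast(savepointPath);
        List<String> expired = new ArrayList<>();
        // Drop the oldest entries until we are back within the retention limit.
        while (retained.size() > maxRetained) {
            expired.add(retained.removeFirst());
        }
        return expired;
    }

    /** Returns the newest retained savepoint, or null if none exists yet (clean start). */
    String latest() {
        return retained.peekLast();
    }
}
```

Under these assumptions, `new SavepointRetention(10)` would play the role of the proposed `enableAutomaticSavePointCleaner(10)`, and `latest()` would give the savepoint to restore from on restart, with `null` meaning a clean start.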