Hi Devs,

Currently there are a number of efforts around checkpoints/savepoints, as reflected by the number of FLIPs. From a quick look, FLIP-34, FLIP-41, FLIP-43, and FLIP-45 are all directly related to these topics. This reflects the importance of these two notions/features to the users of the framework.

Although many efforts are centred around these notions, their semantics and the interplay between them are not always clearly defined. This makes them difficult to explain to users (all the different combinations of state backends, formats and tradeoffs), and in some cases it may have negative effects on users (e.g. the issue, fixed some time ago, of savepoints not being considered for recovery even though they had committed side effects).

FLIP-47 [1] and the related document [2] aim to start a discussion around the semantics of savepoints/checkpoints and their interplay, and to some extent to help us settle the future steps concerning these notions. As an example, should we work towards bringing them closer, or moving them further apart?

This is by no means a complete proposal, as many of the practical implications can only be fleshed out after we agree on the basic semantics and the general frame around these notions. To that end, there are no concrete implementation steps, and the FLIP is going to be updated as the discussion continues.

I am really looking forward to your opinions on the topic.

Cheers,
Kostas

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-47%3A+Checkpoints+vs.+Savepoints
[2] https://docs.google.com/document/d/1_1FF8D3u0tT_zHWtB-hUKCP_arVsxlmjwmJ-TvZd4fs/edit?usp=sharing