Btw, I think this FLIP is a very good effort; we just need to reframe it a tiny bit. +1
> On 6. Jun 2019, at 13:41, Aljoscha Krettek <aljos...@apache.org> wrote:
>
> Hi,
>
> I had a brief discussion with Stephan that helped me sort my thoughts on the
> broader topics of checkpoints, savepoints, binary formats, user-triggered
> checkpoints, and periodic savepoints. I’ll try to summarise my stance on this
> and also comment with the same message on the other relevant Jira issues and
> threads.
>
> For reference, the relevant FLIP and Jira issues are these:
>
> - https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Keyed+State+Snapshot+Binary+Format+for+Savepoints:
>   Unified Savepoint Format
> - https://issues.apache.org/jira/browse/FLINK-12619: Add support for
>   stop-with-checkpoint
> - https://issues.apache.org/jira/browse/FLINK-6755: User-triggered checkpoints
> - https://issues.apache.org/jira/browse/FLINK-4620: Automatically creating
>   savepoints
> - https://issues.apache.org/jira/browse/FLINK-4511: Schedule periodic
>   savepoints
>
> There are roughly two different dimensions in the topic of
> savepoints/checkpoints (I’ll use snapshot as the generic term for both):
> 1) who controls the snapshot
> 2) what’s the (binary) format of the snapshot
>
> For 1), we currently have checkpoints and savepoints. Checkpoints are created
> by the system for fault tolerance. They are managed by the system and the
> system is free to discard them when it sees fit. Savepoints are in the
> control of the user. A user can choose to create a savepoint, they can
> delete them, they can restore from them at will. The system will not clean up
> savepoints. We should try and keep this separation and not muddle the two
> concepts.
>
> For 2), we currently have various different formats between the different
> state backends and also for the same backend. For example, RocksDB can do
> full or incremental snapshots, local snapshots, and probably more.
>
> FLIP-41 aims at introducing a unified “savepoint” format that is
> interchangeable between the different state backends. In light of the above
> points, we should say that FLIP-41 aims to introduce a canonical format that
> is interchangeable between different backends. This doesn’t mean that we
> should tie this format strictly to savepoints, though. For performance
> reasons, users might choose to do savepoints that use one of the optimised
> formats that the backends offer, for example incremental snapshots. Or they
> might choose to use the canonical format for regular checkpoints so that they
> can always switch between backends using periodically created externalised
> checkpoints.
>
> The motivation behind FLINK-12619 is to have a more lightweight alternative
> for stop-with-savepoint, for example using the incremental snapshot format
> that RocksDB has. With the above in mind, however, this becomes “Add support
> for choosing the snapshot format for stop-with-savepoint”. It should not be
> stop-with-checkpoint, because checkpoints are something that the system
> manages and not something that the user should trigger. The same is true for
> FLINK-6755; the motivation is the same, I think. The change should be called
> “Add support for choosing the snapshot format for savepoints”, however.
>
> For the last two Jira issues mentioned above it should be quite clear what I
> think. I do, however, see a need for potentially different overlapping
> checkpoint periods or intervals. Users might want to have their regular
> checkpoints use an optimised format but they also want to have a “canonical
> format” checkpoint every now and then so that the lineage of incremental
> checkpoints does not become too unwieldy.
>
> Please let me know what you think!
>
> Aljoscha
>
>> On 5. Jun 2019, at 10:36, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>
>> I want to quickly bump this discussion to gather more consensus from others
>> on the FLIP, and see if we want to aim this for the upcoming 1.9.0 release.
>> The proposal touches binary formats of savepoints, which is a major part of
>> Flink's public user interface, so having explicit approval from other
>> members of the community would be nice here.
>>
>> Cheers,
>> Gordon
>>
>> On Wed, May 29, 2019 at 11:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
>> wrote:
>>
>>> I also should point out something that I forgot to mention in the initial
>>> post:
>>> Stefan has helped a lot in understanding the current status of state
>>> backends and also participated a lot in design choices for the FLIP :)
>>>
>>> On Wed, May 29, 2019 at 5:02 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
>>> wrote:
>>>
>>>> Hi Flink devs,
>>>>
>>>> Congxian, Kostas, and I have recently been discussing unifying the binary
>>>> formats for keyed state in savepoints, which would allow for more
>>>> operational flexibility such as swapping state backends across restores.
>>>>
>>>> As part of this FLIP, another main proposal is to start allowing
>>>> checkpoints and savepoints to have different formats. Savepoint formats
>>>> should in the future be designed with interoperability in mind, where
>>>> reasonable snapshot / restore overhead is tolerable, while checkpoints are
>>>> allowed to be backend specific for more efficient snapshots and restores.
>>>> From recent proposals in the state backends such as disk-spilling heap
>>>> backend [1], this flexibility seems to be reasonable.
>>>>
>>>> The main user-facing API this would affect is, of course, the binary
>>>> formats of savepoints, as well as the fact that we will no longer be
>>>> guaranteeing functional parity between savepoints and full checkpoints in
>>>> the future (w.r.t. operational features related to upgrading applications;
>>>> so far they have equal functionality).
>>>>
>>>> Therefore, we would like to collect feedback on the proposal before
>>>> continuing efforts.
>>>>
>>>> This is the FLIP:
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Keyed+State+Snapshot+Binary+Format+for+Savepoints
>>>>
>>>> I'm happy to discuss details and am looking forward to any feedback.
>>>>
>>>> Cheers,
>>>> Gordon
>>>>
>>>> [1]
>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Proposal-to-support-disk-spilling-in-HeapKeyedStateBackend-td29109.html