Please also see my comment on https://issues.apache.org/jira/browse/FLINK-12619?focusedCommentId=16864098&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16864098
For FLIP-41, this means we go forward with the design basically as is, but should call it “Unified Format” or something like it. If no one else comments, we should proceed to a [VOTE] thread to formally adopt the FLIP.

Aljoscha

> On 14. Jun 2019, at 15:40, Yu Li <l...@apache.org> wrote:
>
> Hi Aljoscha and all,
>
> My 2 cents here:
>
> 1. Conceptually, it is worth giving a second thought to introducing an optimized snapshot format for now (i.e. using the checkpoint format in savepoints), just as it's not recommended to use snapshots for backup in a database (although practically it could be implemented).
>
> 2. The stop-with-checkpoint mechanism is like stopping a database instance with a data flush, and thus (IMHO) a different story from the checkpoint/savepoint (db snapshot/backup) distinction.
>
> 3. In the long run we may improve checkpoints to allow a short enough interval that they become a form of transactional log; then we could enable checkpoint-based savepoints (like transactional-log-based backup). So I agree to still call the new format in FLIP-41 a "Unified Format", although in the short term it only unifies savepoints.
>
> I've also written a document [1] with more details; please refer to it if interested. Thanks!
>
> [1] https://docs.google.com/document/d/1uE4R3wNal6e67FkDe0UvcnsIMMDpr35j
>
> Best Regards,
> Yu
>
> On Thu, 6 Jun 2019 at 19:42, Aljoscha Krettek <aljos...@apache.org> wrote:
>
>> Btw, I think this FLIP is a very good effort, we just need to reframe the effort a tiny bit. +1
>>
>>> On 6. Jun 2019, at 13:41, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>
>>> Hi,
>>>
>>> I had a brief discussion with Stephan that helped me sort my thoughts on the broader topics of checkpoints, savepoints, binary formats, user-triggered checkpoints, and periodic savepoints. I’ll try to summarise my stance on this and also comment with the same message on the other relevant Jira Issues and threads.
>>>
>>> For reference, the relevant FLIP and Jira issues are these:
>>>
>>> - https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Keyed+State+Snapshot+Binary+Format+for+Savepoints: Unified Savepoint Format
>>> - https://issues.apache.org/jira/browse/FLINK-12619: Add support for stop-with-checkpoint
>>> - https://issues.apache.org/jira/browse/FLINK-6755: User-triggered checkpoints
>>> - https://issues.apache.org/jira/browse/FLINK-4620: Automatically creating savepoints
>>> - https://issues.apache.org/jira/browse/FLINK-4511: Schedule periodic savepoints
>>>
>>> There are roughly two different dimensions in the topic of savepoints/checkpoints (I’ll use snapshot as the generic term for both):
>>> 1) who controls the snapshot
>>> 2) what’s the (binary) format of the snapshot
>>>
>>> For 1), we currently have checkpoints and savepoints. Checkpoints are created by the system for fault tolerance. They are managed by the system and the system is free to discard them when it sees fit. Savepoints are in the control of the user. A user can choose to create a savepoint, they can delete them, they can restore from them at will. The system will not clean up savepoints. We should try and keep this separation and not muddle the two concepts.
>>>
>>> For 2), we currently have various different formats between the different state backends and also for the same backend. For example, RocksDB can do full or incremental snapshots, local snapshots, and probably more.
>>>
>>> FLIP-41 aims at introducing a unified “savepoint” format that is interchangeable between the different state backends. In light of the above points, we should say that FLIP-41 aims to introduce a canonical format that is interchangeable between different backends.
>>> This doesn’t mean that we should tie this format strictly to savepoints, though. For performance reasons, users might choose to do savepoints that use one of the optimised formats that the backends offer, for example incremental snapshots. Or they might choose to use the canonical format for regular checkpoints so that they can always switch between backends using periodically created externalised checkpoints.
>>>
>>> The motivation behind FLINK-12619 is to have a more lightweight alternative for stop-with-savepoint, for example using the incremental snapshot format that RocksDB has. With the above in mind, however, this becomes “Add support for choosing the snapshot format for stop-with-savepoint”. It should not be stop-with-checkpoint, because checkpoints are something that the system manages and not something that the user should trigger. The same is true for FLINK-6755; the motivation is the same, I think. That change, however, should be called “Add support for choosing the snapshot format for savepoints”.
>>>
>>> For the last two Jira issues mentioned above it should be quite clear what I think. I do, however, see a need for potentially different overlapping checkpoint periods or intervals. Users might want to have their regular checkpoints use an optimised format but also want a “canonical format” checkpoint every now and then so that the lineage of incremental checkpoints does not become too unwieldy.
>>>
>>> Please let me know what you think!
>>>
>>> Aljoscha
>>>
>>>> On 5. Jun 2019, at 10:36, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>>>
>>>> I want to quickly bump this discussion to gather more consensus from others on the FLIP, and see if we want to aim this for the upcoming 1.9.0 release.
>>>> The proposal touches binary formats of savepoints, which is a major part of Flink's public user interface, so having explicit approval from other members of the community would be nice here.
>>>>
>>>> Cheers,
>>>> Gordon
>>>>
>>>> On Wed, May 29, 2019 at 11:45 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>>>
>>>>> I also should point out something that I forgot to mention in the initial post:
>>>>> Stefan has helped a lot in understanding the current status of state backends and also participated a lot in design choices for the FLIP :)
>>>>>
>>>>> On Wed, May 29, 2019 at 5:02 AM Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
>>>>>
>>>>>> Hi Flink devs,
>>>>>>
>>>>>> Congxian, Kostas, and I have recently been discussing unifying the binary formats for keyed state in savepoints, which would allow for more operational flexibility such as swapping state backends across restores.
>>>>>>
>>>>>> As part of this FLIP, another main proposal is to start allowing checkpoints and savepoints to have different formats. Savepoint formats should in the future be designed with interoperability in mind, where reasonable snapshot / restore overhead is tolerable, while checkpoints are allowed to be backend specific for more efficient snapshots and restores. Given recent proposals for the state backends, such as the disk-spilling heap backend [1], this flexibility seems reasonable.
>>>>>>
>>>>>> The main user-facing API this would affect is, of course, the binary formats of savepoints, as well as the fact that we will no longer be guaranteeing functional parity between savepoints and full checkpoints in the future (w.r.t. operational features related to upgrading applications; so far they have equal functionality).
>>>>>>
>>>>>> Therefore, we would like to collect feedback on the proposal before continuing efforts.
>>>>>>
>>>>>> This is the FLIP:
>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Keyed+State+Snapshot+Binary+Format+for+Savepoints
>>>>>>
>>>>>> I'm happy to discuss details and looking forward to any feedback.
>>>>>>
>>>>>> Cheers,
>>>>>> Gordon
>>>>>>
>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Proposal-to-support-disk-spilling-in-HeapKeyedStateBackend-td29109.html