Hi,

Thanks for the comments and questions. Starting from the top:

Seth: good point about schema evolution. Actually, I have a very similar
question about the State Processor API. Is it the same scenario in this case?
Should it also work with checkpoints already, and is it perhaps just
untested? Next question: should we commit to supporting those two things
(State Processor API and schema evolution) for native savepoints? What about
aligned checkpoints? (Please check [1] for that.)

Yu Li: 1, 2 and 4 done.

> 3. How about changing the description of "the default configuration of the
> checkpoints will be used to determine whether the savepoint should be
> incremental or not" to something like "the `state.backend.incremental`
> setting now denotes the type of native format snapshot and will take effect
> for both checkpoint and savepoint (with native type)", to prevent concept
> confusion between checkpoint and savepoint?

Is `state.backend.incremental` the only configuration parameter that can be
used in this context? I would guess not. What about, for example,
"state.storage.fs.memory-threshold" or all of the Advanced RocksDB State
Backends Options [2]?

David:

> does this mean that we need to keep the checkpoints compatible across minor
> versions? Or can we say, that the minor version upgrades are only
> guaranteed with canonical savepoints?

Good question. Frankly, I was always assuming that this is implicitly given.
Otherwise users would not be able to recover jobs that are failing because of
bugs in Flink. But I'm pretty sure that was never explicitly stated.

As Konstantin suggested, I've written down the pre-existing guarantees of
checkpoints and savepoints, followed by two proposals on how they should be
changed [1]. Could you take a look?

I'm especially unsure about the following things:
a) What about RocksDB upgrades? If we bump the RocksDB version between Flink
versions, do we support recovering from a native format snapshot
(incremental checkpoint)?
b) State Processor API - both the pre-existing support and what we want to
provide in the future
c) Schema Evolution - both the pre-existing support and what we want to
provide in the future

Best,
Piotrek

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Checkpointvssavepointguarantees
[2]
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#advanced-rocksdb-state-backends-options

On Tue, 11 Jan 2022 at 09:45, Konstantin Knauf <kna...@apache.org> wrote:

> Hi Piotr,
>
> would it be possible to provide a table that shows the compatibility
> guarantees provided by the different snapshots going forward? Like type of
> change (Topology, State Schema, Parallelism, ..) in one dimension, and type
> of snapshot as the other dimension. Based on that, it would be easier to
> discuss those guarantees, I believe.
>
> Cheers,
>
> Konstantin
>
> On Mon, Jan 3, 2022 at 9:11 AM David Morávek <d...@apache.org> wrote:
>
> > Hi Piotr,
> >
> > does this mean that we need to keep the checkpoints compatible across
> > minor versions? Or can we say, that the minor version upgrades are only
> > guaranteed with canonical savepoints?
> >
> > My concern is especially if we'd want to change layout of the checkpoint.
> >
> > D.
> >
> > On Wed, Dec 29, 2021 at 5:19 AM Yu Li <car...@gmail.com> wrote:
> >
> > > Thanks for the proposal Piotr! Overall I'm +1 for the idea, and below
> > > are my two cents:
> > >
> > > 1. How about adding a "Term Definition" section and clarify what
> > > "native format" (the "native" data persistence format of the current
> > > state backend) and "canonical format" (the "uniform" format that
> > > supports switching state backends) means?
> > >
> > > 2. IIUC, currently the FLIP proposes to only support incremental
> > > savepoint with native format, and there's no plan to add such support
> > > for canonical format, right?
> > > If so, how about writing this down explicitly in the FLIP doc, maybe in
> > > a "Limitations" section, plus the fact that `HashMapStateBackend`
> > > cannot support incremental savepoint before FLIP-151 is done? (side
> > > note: @Roman just a kind reminder to please take FLIP-203 into account
> > > when implementing FLIP-151)
> > >
> > > 3. How about changing the description of "the default configuration of
> > > the checkpoints will be used to determine whether the savepoint should
> > > be incremental or not" to something like "the
> > > `state.backend.incremental` setting now denotes the type of native
> > > format snapshot and will take effect for both checkpoint and savepoint
> > > (with native type)", to prevent concept confusion between checkpoint
> > > and savepoint?
> > >
> > > 4. How about putting the notes of behavior change (the default type of
> > > savepoint will be changed to `native` in the future, and by then the
> > > taken savepoint cannot be used to switch state backends by default) to
> > > a more obvious place, for example moving from the "CLI" section to the
> > > "Compatibility" section? (although it will only happen in 1.16 release
> > > based on the proposed plan)
> > >
> > > And all above suggestions apply for our user-facing document after the
> > > FLIP is (partially or completely, accordingly) done, if taken (smile).
> > >
> > > Best Regards,
> > > Yu
> > >
> > > On Tue, 21 Dec 2021 at 22:23, Seth Wiesman <sjwies...@gmail.com> wrote:
> > >
> > > > >> AFAIK state schema evolution should work both for native and
> > > > >> canonical savepoints.
> > > >
> > > > Schema evolution does technically work for both formats, it happens
> > > > after the code paths have been unified, but the community has up
> > > > until this point considered that an unsupported feature.
> > > > From my perspective making this supported could be as simple as
> > > > adding test coverage but that's an active decision we'd need to make.
> > > >
> > > > On Tue, Dec 21, 2021 at 7:43 AM Piotr Nowojski <pnowoj...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > > In this context: will the native format support state schema
> > > > > > evolution? If not, I am not sure, we can let the format default
> > > > > > to native.
> > > > >
> > > > > AFAIK state schema evolution should work both for native and
> > > > > canonical savepoints.
> > > > >
> > > > > We will document what is/will be supported as part of FLIP-203.
> > > > > But it's not as simple as just the difference between native and
> > > > > canonical formats.
> > > > >
> > > > > Best, Piotrek
> > > > >
> > > > > On Mon, 20 Dec 2021 at 14:28, Konstantin Knauf <kna...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Piotr,
> > > > > >
> > > > > > Thanks a lot for starting the discussion. Big +1.
> > > > > >
> > > > > > In my understanding, this FLIP introduces the snapshot format as
> > > > > > a *really* user facing concept. IMO it is important that we
> > > > > > document
> > > > > >
> > > > > > a) that it is no longer the checkpoint/savepoint characteristics
> > > > > > that determine the kind of changes that a snapshot allows (user
> > > > > > code, state schema evolution, topology changes), but now this
> > > > > > becomes a property of the format regardless of whether this is a
> > > > > > snapshot or a checkpoint
> > > > > > b) the exact changes that each format allows (code, state schema,
> > > > > > topology, state backend, max parallelism)
> > > > > >
> > > > > > In this context: will the native format support state schema
> > > > > > evolution? If not, I am not sure, we can let the format default
> > > > > > to native.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > > On Mon, Dec 20, 2021 at 2:09 PM Piotr Nowojski
> > > > > > <pnowoj...@apache.org> wrote:
> > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > I would like to start a discussion about a previously announced
> > > > > > > follow up of the FLIP-193 [1], namely allowing savepoints to be
> > > > > > > in native format and incremental. The changes do not seem
> > > > > > > invasive. The full proposal is written down as FLIP-203:
> > > > > > > Incremental savepoints [2]. Please take a look, and let me know
> > > > > > > what you think.
> > > > > > >
> > > > > > > Best,
> > > > > > > Piotrek
> > > > > > >
> > > > > > > [1]
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
> > > > > > > [2]
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-203%3A+Incremental+savepoints#FLIP203:Incrementalsavepoints-Semantic
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Konstantin Knauf
> > > > > >
> > > > > > https://twitter.com/snntrable
> > > > > >
> > > > > > https://github.com/knaufk
>
> --
>
> Konstantin Knauf
>
> https://twitter.com/snntrable
>
> https://github.com/knaufk