Thanks for the detailed replay, +1 from my side. Best, Congxian
Seth Wiesman <sjwies...@gmail.com> 于2020年9月17日周四 上午1:33写道: > Hi Stephan, > > Regarding backward compatibility, I agree and the intention is that all > existing code will continue to function with the same semantics. My working > idea is to remove the two checkpoint-storage related methods from > StateBackend into a new SnapshotStorage interface but then have > AbstractFileStateBackend and RocksDBStateBackend implement snapshot > storage. If a state backend implements SnapshotStorage it will be used > unconditionally, even if a different snapshot storage implementation is > configured. This way we don't change any of the concrete classes that users > interact with. The only people who would see breaking changes are state > backend implementors and they only need to add `implements SnapshotStorage` > to their class. > > The reason I went with SnapshotStorage is there is already an interface > `org.apache.flink.runtime.state.CheckpointStorage` in flink-runtime. If we > can rename this interface to something else I'm happy to take the name, but > if not I think it will lead to import confusion. > > Seth > > On Wed, Sep 16, 2020 at 11:54 AM Stephan Ewen <se...@apache.org> wrote: > > > @Yun and @Congxian: > > > > I think "async", "incremental", and similar flags belong very much with > the > > state backend (the index structure). > > They define how the snapshotting procedure behaves. > > > > The SnapshotStorage is really just about storage of checkpoint streams > > (bytes) and handles and pointers. > > > > Best, > > Stephan > > > > > > On Wed, Sep 16, 2020 at 6:48 PM Stephan Ewen <se...@apache.org> wrote: > > > > > Thanks for the great suggestion and the great discussion. Generally big > > +1 > > > to this effort. > > > Some thoughts from my side: > > > > > > *## Backwards Compatibility* > > > > > > I think we should really strive to make this non breaking. Maybe we > have > > > new classes / interfaces for StateBackends and CheckpointStorage and > let > > > the existing State Backend classes implement both (and deprecate them)? > > > > > > In the past, I have gotten some harsh comments from users about > breaking > > > long-time effectively stable APIs, so let's try hard to avoid this > > (unless > > > it makes things impossible). > > > > > > *## Naming* > > > > > > HashMapStateBackend sounds good to me > > > > > > Could we rename the SnapshotStorage to CheckpointStorage? Or converge > all > > > methods around "Snapshot"? > > > I think we already have some confusion from mixing the terms checkpoint > > > and snapshot and should converge in either direction. > > > I am slightly leaning towards converging around checkpoints, because > > > that's the most commonly known term among users as far as I can tell. > > > Checkpoints are Snapshots. But one could also just call them > Checkpoints > > > and let Savepoints be special Checkpoints. > > > > > > *## Integrated State / Storage Backends* > > > > > > There is an idea floating around now and then about a Cassandra backend > > > (or other K/V store) where the state index and durable location are > > tightly > > > intertwined. > > > However, I think this would not contradict, because it might just mean > > > that the checkpoint storage is used less (maybe only for savepoints, or > > for > > > WALs). > > > > > > *## Future Fault Tolerance Ideas* > > > > > > I think this conflicts with none of the future fault tolerance ideas I > am > > > involved with. > > > Similar to the above, there is always some checkpoint storage involved, > > > for example for savepoints or for backup/consolidation, so no problem. > > > > > > > > > Best, > > > Stephan > > > > > > On Wed, Sep 16, 2020 at 5:11 PM Aljoscha Krettek <aljos...@apache.org> > > > wrote: > > > > > >> I think the mentioned settings should be in the state backend. They > > >> configure how a certain backend writes to a snapshot storage, but it’s > > >> still the backend that has the logic and decides. > > >> > > >> I think it's a good point, though, to be conscious about those > settings. > > >> I'm sure we can figure out the details during implementation, though. > > >> > > >> Best, > > >> Aljoscha > > >> > > >> On 16.09.20 16:54, Seth Wiesman wrote: > > >> > Hi Congxian, > > >> > > > >> > There is an allusion to those configs in the wiki but let me better > > >> spell > > >> > out my thinking. The flink-conf configurations will not change and I > > >> > believe the java code switches should remain on the state backend > > >> objects. > > >> > > > >> > We are of course not fully disentangling state backends from > snapshots > > >> and > > >> > these configurations affect how your state backend runs in > > production. I > > >> > believe users would find it strange to have configurations like > > >> > `state.backend.rocksdb.checkpoint.transfer.thred.num` not be part of > > the > > >> > EmbeddedRocksdbStateBackend but somewhere else. This then leads to > the > > >> > question, is it better to split configurations between multiple > places > > >> or > > >> > not. Users appreciate consistency, and so having all the > > configurations > > >> on > > >> > the state backend objects makes them more discoverable and your > > >> application > > >> > easier to reason about. > > >> > > > >> > Additionally, I view these as advanced configurations. My hope is > most > > >> > users can simply use the no-arg constructor for a state backend in > > >> > production. If a user is changing the number of rocksdb transfer > > >> threads or > > >> > disabling async checkpoints, they likely know what they are doing. > > >> > > > >> > Please let me know if you have any concerns or would like to cancel > > the > > >> > vote. > > >> > > > >> > Seth > > >> > > > >> > On Wed, Sep 16, 2020 at 12:37 AM Congxian Qiu < > qcx978132...@gmail.com > > > > > >> > wrote: > > >> > > > >> >> Sorry for jump late in. > > >> >> > > >> >> I like the separation here, this separation makes more user > friendly > > >> now. > > >> >> > > >> >> I just wonder how the configuration such as > > >> 'state.backend.incremental', > > >> >> 'state.backend.async' and > > >> >> `state.backend.rocksdb.checkpoint.transfer.thred.num` will be > > >> configured > > >> >> after the separation, I think these configurations are more related > > to > > >> >> snapshots (maybe a little strange to configure these on > statebackend > > >> side). > > >> >> did not see this on the FLIP wiki currently. > > >> >> > > >> >> Best, > > >> >> Congxian > > >> >> > > >> >> > > >> >> Seth Wiesman <sjwies...@gmail.com> 于2020年9月15日周二 下午9:51写道: > > >> >> > > >> >>> Sounds good to me. I'll update the FLIP. > > >> >>> > > >> >>> On Tue, Sep 15, 2020 at 8:35 AM Dawid Wysakowicz < > > >> dwysakow...@apache.org > > >> >>> > > >> >>> wrote: > > >> >>> > > >> >>>> There is a good number of precedents that introduced backwards > > >> >>>> incompatible changes to that interface (which is PublicEvolving > > btw). > > >> >> We > > >> >>>> introduced a couple of additional arguments to the > > >> >>>> createKeyedStateBackend method and later on removed the methods > > with > > >> >>>> default implementation for backwards compatibility. I want to > > >> introduce > > >> >>>> a backward incompatible change in FLIP-140 (replace the > > >> >>>> AbstractKeyedStateBackend with an interface). From my perspective > > we > > >> >>>> should just do these changes. The impact should be minimal. > > >> >>>> > > >> >>>> Best, > > >> >>>> > > >> >>>> Dawid > > >> >>>> > > >> >>>> > > >> >>>> On 15/09/2020 15:20, Seth Wiesman wrote: > > >> >>>>> Hey Dawid, > > >> >>>>> > > >> >>>>> I didn't want to break compatibility but if there is precedent > and > > >> >>>> everyone > > >> >>>>> is ok with it then I'm +1. > > >> >>>>> > > >> >>>>> Seth > > >> >>>>> > > >> >>>>> On Tue, Sep 15, 2020 at 2:22 AM Dawid Wysakowicz < > > >> >>> dwysakow...@apache.org > > >> >>>>> > > >> >>>>> wrote: > > >> >>>>> > > >> >>>>>> Sorry for joining so late. > > >> >>>>>> > > >> >>>>>> Generally speaking I like this idea very much! > > >> >>>>>> > > >> >>>>>> I have one idea about the StateBackend interface. Could we > > instead > > >> >> of > > >> >>>>>> adding a flag method boolean isLegacyStateBackend remove the > > >> >>>>>> checkpointstorage related methods from StateBackend right away? > > The > > >> >>>>>> old/legacy implementations could then implement both > StateBackend > > >> >> and > > >> >>>>>> SnapshotStorage. In turn in the method env.setStateBackend we > > could > > >> >>> do: > > >> >>>>>> > > >> >>>>>> setStateBackend(StateBackend backend) { > > >> >>>>>> > > >> >>>>>> this.stateBackend = backend; > > >> >>>>>> > > >> >>>>>> if (backend instanceof SnapshotStorage) { > > >> >>>>>> > > >> >>>>>> this.setSnapshotStorage(backend); > > >> >>>>>> > > >> >>>>>> } > > >> >>>>>> > > >> >>>>>> } > > >> >>>>>> > > >> >>>>>> This has the benefit that we could already get rid off the > > methods > > >> >>> from > > >> >>>>>> StateBackend which would be problematic in the new > > implementations > > >> >>> (such > > >> >>>>>> as e.g. HashMapStateBackend - what would you return there? > > null?). > > >> I > > >> >>>>>> know this would break the interface, but StateBackend is > actually > > >> >>> quite > > >> >>>>>> internal, we did it quite freely in the past, and I don't think > > >> >> there > > >> >>>>>> are many custom state implementation in the wild. And even if > > there > > >> >>> are > > >> >>>>>> some the workaround is as easy as simply adding implements > > >> >>>> SnapshotStorage. > > >> >>>>>> > > >> >>>>>> Best, > > >> >>>>>> > > >> >>>>>> Dawid > > >> >>>>>> > > >> >>>>>> On 11/09/2020 16:48, Aljoscha Krettek wrote: > > >> >>>>>>> I could try and come up with a longer name if you need it ... > > ;-) > > >> >>>>>>> > > >> >>>>>>> Aljoscha > > >> >>>>>>> > > >> >>>>>>> On 11.09.20 16:25, Seth Wiesman wrote: > > >> >>>>>>>> Having thought about it more, HashMapStateBackend has won me > > >> over. > > >> >>>> I'll > > >> >>>>>>>> update the FLIP. If there aren't any more comments I'll open > it > > >> up > > >> >>> for > > >> >>>>>>>> voting on monday. > > >> >>>>>>>> > > >> >>>>>>>> Seth > > >> >>>>>>>> > > >> >>>>>>>> On Wed, Sep 9, 2020 at 9:09 AM Seth Wiesman < > > sjwies...@gmail.com > > >> > > > >> >>>>>> wrote: > > >> >>>>>>>>> @Yun yes, this is really about making CheckpointStorage an > > >> >>> orthogonal > > >> >>>>>>>>> concept. I think we can remain pragmatic and keep > > state-backend > > >> >>>>>>>>> specific > > >> >>>>>>>>> configurations (async, incremental, etc) in the state > backend > > >> >>>>>>>>> themselves. I > > >> >>>>>>>>> view these as more advanced configurations and by the time > > >> >> someone > > >> >>> is > > >> >>>>>>>>> changing the defaults they likely understand what is going > on. > > >> My > > >> >>>>>>>>> goal is > > >> >>>>>>>>> to help on-board users and so long as each state backend > has a > > >> >>> no-arg > > >> >>>>>>>>> default constructor that works for many users I think we've > > >> >>> achieved > > >> >>>>>>>>> that > > >> >>>>>>>>> goal. > > >> >>>>>>>>> > > >> >>>>>>>>> Regarding the checkpoint coordinator, that makes sense but I > > >> will > > >> >>>>>>>>> consider > > >> >>>>>>>>> out of the scope of this FLIP. I want to focus on > simplifying > > >> >> APIs. > > >> >>>>>>>>> > > >> >>>>>>>>> @Aljoscha Krettek <aljos...@apache.org> > > >> >>>>>>>>> > > >> >>>>>>>>> My feeling is that state backends and checkpointing are > going > > to > > >> >> be > > >> >>>>>>>>> integral to Flink for many years, regardless or other > > >> >> enhancements > > >> >>>>>>>>> so this > > >> >>>>>>>>> change is still valuable. > > >> >>>>>>>>> > > >> >>>>>>>>> Since this is a FLIP about improving the user api I'm happy > to > > >> >>>> bikeshed > > >> >>>>>>>>> the names a little more than normal. HashMap makes sense, my > > >> >> other > > >> >>>>>>>>> thought > > >> >>>>>>>>> was InMemory. > > >> >>>>>>>>> > > >> >>>>>>>>> Seth > > >> >>>>>>>>> > > >> >>>>>>>>> > > >> >>>>>>>>> > > >> >>>>>>>>> On Wed, Sep 9, 2020 at 8:04 AM Aljoscha Krettek < > > >> >>> aljos...@apache.org > > >> >>>>> > > >> >>>>>>>>> wrote: > > >> >>>>>>>>> > > >> >>>>>>>>>> I like it a lot! > > >> >>>>>>>>>> > > >> >>>>>>>>>> I think it makes sense to clean this up despite the planned > > new > > >> >>>>>>>>>> fault-tolerance mechanisms. In the future, users will > decide > > >> >> which > > >> >>>>>>>>>> mechanism to use and I can imagine that a lot of them will > > keep > > >> >>>> using > > >> >>>>>>>>>> the current mechanism for quite a while to come. But I'm > > happy > > >> >> to > > >> >>>>>>>>>> yield > > >> >>>>>>>>>> to Stephan's opinion here, he knows more about the progress > > of > > >> >>> that > > >> >>>>>>>>>> work. > > >> >>>>>>>>>> > > >> >>>>>>>>>> The one nitpick I have is about naming: will users > understand > > >> >>>>>>>>>> OnHeapStateBackend? I mean, do they know what > > on-heap/off-heap > > >> >>>>>>>>>> memory is > > >> >>>>>>>>>> and the tradeoffs? An alternative could be > > HashMapStateBackend, > > >> >>>>>>>>>> because > > >> >>>>>>>>>> that's essentially what it is. I wouldn't block anything on > > >> >> this, > > >> >>>>>>>>>> though. > > >> >>>>>>>>>> > > >> >>>>>>>>>> Aljoscha > > >> >>>>>>>>>> > > >> >>>>>>>>>> On 09.09.20 10:05, Konstantin Knauf wrote: > > >> >>>>>>>>>>> Thanks for the initiative. Big +1. Would be interested to > > hear > > >> >> if > > >> >>>> the > > >> >>>>>>>>>>> proposed interfaces still make sense in the face of the > new > > >> >>>>>>>>>> fault-tolerance > > >> >>>>>>>>>>> work that is planned. Stephan/Piotr will know. > > >> >>>>>>>>>>> > > >> >>>>>>>>>>> On Tue, Sep 8, 2020 at 7:05 PM Seth Wiesman < > > >> >> sjwies...@gmail.com > > >> >>>> > > >> >>>>>>>>>> wrote: > > >> >>>>>>>>>>>> Hi Devs, > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> I'd like to propose an update to how state backends and > > >> >>> checkpoint > > >> >>>>>>>>>> storage > > >> >>>>>>>>>>>> are configured to help users better understand Flink. > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> Apache Flink's durability story is a mystery to many > users. > > >> >> One > > >> >>>>>>>>>>>> of the > > >> >>>>>>>>>> most > > >> >>>>>>>>>>>> common recurring questions from users comes from not > > >> >>>>>>>>>>>> understanding the > > >> >>>>>>>>>>>> relationship between state, state backends, and > snapshots. > > >> >> Some > > >> >>>>>>>>>>>> of this > > >> >>>>>>>>>>>> confusion can be abated with learning material but the > > >> >> question > > >> >>>>>>>>>>>> is so > > >> >>>>>>>>>>>> pervasive that we believe Flink’s user APIs should be > > better > > >> >>>>>>>>>> communicate > > >> >>>>>>>>>>>> what different components are responsible for. > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> > > >> >>>>>> > > >> >>>> > > >> >>> > > >> >> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> I look forward to a healthy discussion. > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>>> Seth > > >> >>>>>>>>>>>> > > >> >>>>>>>>>>> > > >> >>>>>>>>>> > > >> >>>>>> > > >> >>>> > > >> >>>> > > >> >>> > > >> >> > > >> > > > >> > > >> > > >