There is a good number of precedents that introduced backwards
incompatible changes to that interface (which is PublicEvolving btw). We
introduced a couple of additional arguments to the
createKeyedStateBackend method and later on removed the methods with
default implementation for backwards compatibility. I want to introduce
a backward incompatible change in FLIP-140 (replace the
AbstractKeyedStateBackend with an interface). From my perspective we
should just do these changes. The impact should be minimal.

Best,

Dawid


On 15/09/2020 15:20, Seth Wiesman wrote:
> Hey Dawid,
>
> I didn't want to break compatibility but if there is precedent and everyone
> is ok with it then I'm +1.
>
> Seth
>
> On Tue, Sep 15, 2020 at 2:22 AM Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
>
>> Sorry for joining so late.
>>
>> Generally speaking I like this idea very much!
>>
>> I have one idea about the StateBackend interface. Could we instead of
>> adding a flag method boolean isLegacyStateBackend remove the
>> checkpointstorage related methods from StateBackend right away? The
>> old/legacy implementations could then implement both StateBackend and
>> SnapshotStorage. In turn in the method env.setStateBackend we could do:
>>
>> setStateBackend(StateBackend backend) {
>>
>>     this.stateBackend = backend;
>>
>>     if (backend instanceof SnapshotStorage) {
>>
>>          this.setSnapshotStorage(backend);
>>
>>     }
>>
>> }
>>
>> This has the benefit that we could already get rid off the methods from
>> StateBackend which would be problematic in the new implementations (such
>> as e.g. HashMapStateBackend - what would you return there? null?). I
>> know this would break the interface, but StateBackend is actually quite
>> internal, we did it quite freely in the past, and I don't think there
>> are many custom state implementation in the wild. And even if there are
>> some the workaround is as easy as simply adding implements SnapshotStorage.
>>
>> Best,
>>
>> Dawid
>>
>> On 11/09/2020 16:48, Aljoscha Krettek wrote:
>>> I could try and come up with a longer name if you need it ... ;-)
>>>
>>> Aljoscha
>>>
>>> On 11.09.20 16:25, Seth Wiesman wrote:
>>>> Having thought about it more, HashMapStateBackend has won me over. I'll
>>>> update the FLIP. If there aren't any more comments I'll open it up for
>>>> voting on monday.
>>>>
>>>> Seth
>>>>
>>>> On Wed, Sep 9, 2020 at 9:09 AM Seth Wiesman <sjwies...@gmail.com>
>> wrote:
>>>>> @Yun yes, this is really about making CheckpointStorage an orthogonal
>>>>> concept. I think we can remain pragmatic and keep state-backend
>>>>> specific
>>>>> configurations (async, incremental, etc) in the state backend
>>>>> themselves. I
>>>>> view these as more advanced configurations and by the time someone is
>>>>> changing the defaults they likely understand what is going on. My
>>>>> goal is
>>>>> to help on-board users and so long as each state backend has a no-arg
>>>>> default constructor that works for many users I think we've achieved
>>>>> that
>>>>> goal.
>>>>>
>>>>> Regarding the checkpoint coordinator, that makes sense but I will
>>>>> consider
>>>>> out of the scope of this FLIP. I want to focus on simplifying APIs.
>>>>>
>>>>> @Aljoscha Krettek <aljos...@apache.org>
>>>>>
>>>>> My feeling is that state backends and checkpointing are going to be
>>>>> integral to Flink for many years, regardless or other enhancements
>>>>> so this
>>>>> change is still valuable.
>>>>>
>>>>> Since this is a FLIP about improving the user api I'm happy to bikeshed
>>>>> the names a little more than normal. HashMap makes sense, my other
>>>>> thought
>>>>> was InMemory.
>>>>>
>>>>> Seth
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Sep 9, 2020 at 8:04 AM Aljoscha Krettek <aljos...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> I like it a lot!
>>>>>>
>>>>>> I think it makes sense to clean this up despite the planned new
>>>>>> fault-tolerance mechanisms. In the future, users will decide which
>>>>>> mechanism to use and I can imagine that a lot of them will keep using
>>>>>> the current mechanism for quite a while to come. But I'm happy to
>>>>>> yield
>>>>>> to Stephan's opinion here, he knows more about the progress of that
>>>>>> work.
>>>>>>
>>>>>> The one nitpick I have is about naming: will users understand
>>>>>> OnHeapStateBackend? I mean, do they know what on-heap/off-heap
>>>>>> memory is
>>>>>> and the tradeoffs? An alternative could be HashMapStateBackend,
>>>>>> because
>>>>>> that's essentially what it is. I wouldn't block anything on this,
>>>>>> though.
>>>>>>
>>>>>> Aljoscha
>>>>>>
>>>>>> On 09.09.20 10:05, Konstantin Knauf wrote:
>>>>>>> Thanks for the initiative. Big +1. Would be interested to hear if the
>>>>>>> proposed interfaces still make sense in the face of the new
>>>>>> fault-tolerance
>>>>>>> work that is planned. Stephan/Piotr will know.
>>>>>>>
>>>>>>> On Tue, Sep 8, 2020 at 7:05 PM Seth Wiesman <sjwies...@gmail.com>
>>>>>> wrote:
>>>>>>>> Hi Devs,
>>>>>>>>
>>>>>>>> I'd like to propose an update to how state backends and checkpoint
>>>>>> storage
>>>>>>>> are configured to help users better understand Flink.
>>>>>>>>
>>>>>>>> Apache Flink's durability story is a mystery to many users. One
>>>>>>>> of the
>>>>>> most
>>>>>>>> common recurring questions from users comes from not
>>>>>>>> understanding the
>>>>>>>> relationship between state, state backends, and snapshots. Some
>>>>>>>> of this
>>>>>>>> confusion can be abated with learning material but the question
>>>>>>>> is so
>>>>>>>> pervasive that we believe Flink’s user APIs should be better
>>>>>> communicate
>>>>>>>> what different components are responsible for.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing
>>>>>>>>
>>>>>>>> I look forward to a healthy discussion.
>>>>>>>>
>>>>>>>>
>>>>>>>> Seth
>>>>>>>>
>>>>>>>
>>>>>>
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to