Re: [DISCUSS][Statebackend][Runtime] Changelog Statebackend Configuration Proposal

Yuan Mei Wed, 09 Jun 2021 04:21:16 -0700

Thank you everyone for replying!

Option 3 wins with a clear majority of the votes, plus mine.


This option works as a refined version of the original proposal in
FLIP-158: Generalized incremental checkpoints [1]:
  - Define consistent override and combination policy (flag + state
backend) in different config levels
  - Define explicitly the meaning of "enable flag" = true/false/unset
  - Hide ChangelogStateBackend from users
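
To illustrate the true/false/unset semantics, here is a minimal sketch of how a tri-state flag could be resolved across two config levels. The names (Flag, resolve) are hypothetical, and the policy shown (an explicit job-level value overrides the cluster-level value, and the changelog stays off when both are unset) is an assumption for illustration, not the policy fixed by the FLIP.

```java
public class ChangelogFlagSketch {

    /** Hypothetical tri-state value for the enable flag: true, false, or unset. */
    public enum Flag { TRUE, FALSE, UNSET }

    /**
     * Assumed policy: an explicitly set job-level flag wins; otherwise the
     * cluster-level flag applies; if both are unset, changelog is disabled.
     */
    public static boolean resolve(Flag jobLevel, Flag clusterLevel) {
        if (jobLevel != Flag.UNSET) {
            return jobLevel == Flag.TRUE;
        }
        if (clusterLevel != Flag.UNSET) {
            return clusterLevel == Flag.TRUE;
        }
        return false; // assumed default when nothing is configured
    }

    public static void main(String[] args) {
        // Job level is unset, so the cluster-level value is used.
        System.out.println(resolve(Flag.UNSET, Flag.TRUE));
        // Job level explicitly disables changelog, overriding the cluster level.
        System.out.println(resolve(Flag.FALSE, Flag.TRUE));
    }
}
```

Keeping "unset" distinct from "false" is what lets a lower-priority level supply the value without the job having to restate it.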

According to the discussion in this thread, we will go with
Option 3: Enable Changelog Statebackend through a Boolean Flag + W/O
ChangelogStateBackend Exposed
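
As a rough sketch of what "flag + not exposed" could look like, the example below wraps whatever backend the user configured in an internal decorator when the flag is true. Everything here (KeyValueBackend, HeapBackend, ChangelogWrapper, create) is a hypothetical stand-in, not Flink's actual StateBackend API, and the in-memory list only imitates a changelog.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChangelogWrapperSketch {

    /** Hypothetical, minimal stand-in for a state backend (not Flink's StateBackend). */
    public interface KeyValueBackend {
        void put(String key, String value);
        String get(String key);
    }

    /** Stand-in for the backend the user actually configures. */
    public static class HeapBackend implements KeyValueBackend {
        private final Map<String, String> state = new HashMap<>();
        @Override public void put(String key, String value) { state.put(key, value); }
        @Override public String get(String key) { return state.get(key); }
    }

    /** Internal decorator: records each write, then delegates to the inner backend. */
    private static final class ChangelogWrapper implements KeyValueBackend {
        private final KeyValueBackend inner;
        // Kept in memory here; a real changelog would be persisted durably.
        private final List<String> changelog = new ArrayList<>();

        ChangelogWrapper(KeyValueBackend inner) { this.inner = inner; }

        @Override public void put(String key, String value) {
            changelog.add(key + "=" + value);
            inner.put(key, value);
        }
        @Override public String get(String key) { return inner.get(key); }
    }

    /** The user supplies only a backend and a flag; the wrapper type never leaks out. */
    public static KeyValueBackend create(KeyValueBackend configured, boolean changelogEnabled) {
        return changelogEnabled ? new ChangelogWrapper(configured) : configured;
    }

    public static void main(String[] args) {
        KeyValueBackend backend = create(new HeapBackend(), true);
        backend.put("count", "1");
        System.out.println(backend.get("count"));
    }
}
```

Because the wrapper is private, callers can only see the interface, which mirrors the intent of keeping ChangelogStateBackend out of the user-facing configuration.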

 [1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints

Best
Yuan

On Tue, Jun 8, 2021 at 6:40 PM Yu Li <car...@gmail.com> wrote:

> +1 for option 3.
>
> IMHO persisting (operator's) state data through a change log is an
> independent mechanism which can work together with all kinds of local
> state stores (heap and rocksdb). This mechanism is similar to the WAL
> (write-ahead log) mechanism in database systems. Although
> implementation-wise we're using the wrapper (decorator) pattern and naming
> it `ChangeLogStateBackend`, it's not really another type of state backend.
> For the same reason, ChangeLogStateBackend should be an internal class and
> not exposed to the end user. Users only need to know / control whether to
> enable the change log or not, just like whether to enable WAL in a
> traditional database system.
>
> Thanks.
>
> Best Regards,
> Yu
>
>
> On Thu, 3 Jun 2021 at 22:50, Piotr Nowojski <pnowoj...@apache.org> wrote:
>
> > Hi,
> >
> > I would actually prefer option 6 (or 5/4), for the sake of configuration
> > being explicit and self explanatory. But at the same time I don't have
> > very hard preferences and from the remaining options, option 3 seems the
> > most reasonable.
> >
> > The question would be, do we want to expose to the users that
> > ChangeLogStateBackend is wrapping an inner state backend or not? If not,
> > option 3 is the best. If we do, that is, if we want to teach the users
> > and help them build an understanding of how things work underneath,
> > option 5 or 6 is better.
> >
> > Best,
> > Piotrek
> >
> > On Wed, 2 Jun 2021 at 04:36, Yun Tang <myas...@live.com> wrote:
> >
> > > Hi Yuan, thanks for launching this discussion.
> > >
> > > I prefer option-3 as this is the easiest to understand for users.
> > >
> > >
> > > Best
> > > Yun Tang
> > > ________________________________
> > > From: Roman Khachatryan <ro...@apache.org>
> > > Sent: Monday, May 31, 2021 16:53
> > > To: dev <dev@flink.apache.org>
> > > Subject: Re: [DISCUSS][Statebackend][Runtime] Changelog Statebackend
> > > Configuration Proposal
> > >
> > > Hey Yuan, thanks for the proposal
> > >
> > > I think Option 3 is the simplest to use and exposes fewer details than
> > > any other.
> > > It's also consistent with the current way of configuring state
> > > backends, as long as we treat change logging as a common feature
> > > applicable to any state backend, like e.g.
> > > state.backend.local-recovery.
> > >
> > > Option 6 seems slightly less preferable as it exposes more details, but
> > > I think it is the most viable alternative.
> > >
> > > Regards,
> > > Roman
> > >
> > >
> > > On Mon, May 31, 2021 at 8:39 AM Yuan Mei <yuanmei.w...@gmail.com> wrote:
> > > >
> > > > Hey all,
> > > >
> > > > We would like to start a discussion on how to enable and configure
> > > > the Changelog Statebackend.
> > > >
> > > > As part of FLIP-158 [1], the Changelog state backend wraps an existing
> > > > state backend (HashMapStateBackend, EmbeddedRocksDBStateBackend, and
> > > > possibly more in the future) and delegates state changes to the
> > > > underlying state backend. This thread is to discuss how the Changelog
> > > > StateBackend should be enabled and configured.
> > > >
> > > > The proposed options to enable/config the state changelog are listed
> > > > below:
> > > >
> > > > Option 1: Enable Changelog Statebackend through a Boolean Flag
> > > >
> > > > Option 2: Enable Changelog Statebackend through a Boolean Flag + a
> > > > Special Case
> > > >
> > > > Option 3: Enable Changelog Statebackend through a Boolean Flag + W/O
> > > > ChangelogStateBackend Exposed
> > > >
> > > > Option 4: Explicit Nested Configuration + “changelog.inner” prefix
> > > > for inner backend
> > > >
> > > > Option 5: Explicit Nested Configuration + inner state backend
> > > > configuration unchanged
> > > >
> > > > Option 6: Config Changelog and Inner Statebackend All-Together
> > > >
> > > > Details of each option can be found here:
> > > >
> > > > https://docs.google.com/document/d/13AaCf5fczYTDHZ4G1mgYL685FqbnoEhgo0cdwuJlZmw/edit?usp=sharing
> > > >
> > > > When considering these options, please consider these four dimensions:
> > > > 1. Consistency
> > > > API/config should follow a consistent model and should not have
> > > > contradictory logic beneath
> > > > 2. Simplicity
> > > > API should be easy to use and not introduce too much burden on users
> > > > 3. Explicitness
> > > > API/config should not contain implicit assumptions and should be
> > > > intuitive to users
> > > > 4. Extensibility
> > > > Whether the current setting can be easily extended in the foreseeable
> > > > future
> > > >
> > > > Please let us know what you think, and please keep the discussion in
> > > > this mailing thread.
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints
> > > >
> > > > Best
> > > > Yuan
> > >
> >
>
