Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

Stephan Ewen Mon, 21 Sep 2020 11:42:47 -0700

To me, the simplifications made by Seth sound good and do make a lot of
sense.
We should really break this down to a few orthogonal guides, then it is
easy for users:


  - Metadata always goes through the JobManager, no matter what
CheckpointStorage.
  - The JobManagerCheckpointStorage has the option to put the metadata on a
file system, to make it externally accessible/addressable, and for master
failover (HA).
  - The file state size threshold is the threshold where data is stored
inline with the metadata, rather than in a separate file. Whatever the
JobManager does with metadata, it is orthogonal.




On Mon, Sep 21, 2020 at 3:59 PM Seth Wiesman <sjwies...@gmail.com> wrote:

> Hi Yu,
>
> Let me address your comments one at a time.
>
> I think I can address comments one and two with a single answer. This FLIP
> does not change any runtime data structures or implementations. As such, it
> only provides new user-facing factory classes for those components.
> StateBackend (the interface) is effectively a factory for both checkpoint
> storage and state backends today and I want to move the checkpoint storage
> methods to a new factory interface CheckpointStorage.
>
> What this means is JobManagerCheckpointStorage is a factory for
> MemoryBackendCheckpointStorage and FileSystemCheckpointStorage is a factory
> for FsCheckpointStorage[1]. All configurations from these classes will be
> made available via the new public APIs. I will make that clear in the
> document. The semantics of how and where they checkpoint will be dictated
> by these runtime classes whose implementations are not to be changed.
>
> Regarding three, yes, a part of any FLIP is to update the documentation and
> as such, I am not going to explicitly outline it in the document.
>
> [1] We might want to rename MemoryBackendCheckpointStorage and
> FsCheckpointStorage but these are internal classes and as such that
> discussion does not need to be a part of the FLIP process.
>
> On Mon, Sep 21, 2020 at 2:40 AM Yu Li <car...@gmail.com> wrote:
>
> > Thanks for the update Seth, and let me further clarify my comments /
> > concerns around the new `CheckpointStorage`.
> >
> > 1. In the existing `MemoryStateBackend`, there's a `maxStateSize` field
> > which limits the maximal state size sent to JM from one single memory
> > backend, with the default size of 5MB. Please make sure to extract this
> > limitation out and keep it when implementing the new
> > `JobManagerCheckpointStorage` (as well as writing this down in our FLIP
> > document).
> >
> > 2. We need to confirm the semantic for `JobManagerCheckpointStorage`.
> >     - Currently in `MemoryBackendCheckpointStorage` we will
> >       a) send the checkpoint data to JM and persist it to the remote FS
> > (included in metadata) if checkpoint path is given, or
> >       b) send the checkpoint data to JM w/o persistency if no checkpoint
> > path given
> >     Does `JobManagerCheckpointStorage` mean checkpoint data will be sent
> to
> > JM first and JM handles everything afterwards? Literally it seems to be
> "JM
> > is the checkpoint storage and no external system required", which matches
> > only to case #b. We need to confirm this and make it clear in our FLIP
> > document (and explain to our users later).
> >
> > 3. Since now we expose the checkpoint storage concept to our users
> through
> > the `setCheckpointStorage` API, I suggest to add below notes in our
> > documents:
> >     a) `JobManagerCheckpointStorage` should be used iff the state size is
> > small enough, and users should take special care not to burst the JM
> memory
> > when using it. And JM will also persist the data to remote FS after the
> > checkpoint is globally completed (depending on our decision of the
> > semantic).
> >     b) When setting the checkpoint storage to
> > `FileSystemCheckpointStorage`, there's still chance that the checkpoint
> > data is sent to JM memory first, decided by the
> > `state.backend.fs.memory-threshold` configuration (to be honest, I find
> > this part ambiguous between JM and FS checkpoint storage)
> >
> > Please let me know your thoughts. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 18 Sep 2020 at 23:02, Seth Wiesman <sjwies...@gmail.com> wrote:
> >
> > > 1. With `FSStateBackend`, we used to decide where to store the
> checkpoint
> > > by the `state.backend.fs.memory-threshold` configuration, and we need
> to
> > > decide how to align with this behavior with the new implementation.
> > >
> > > I see this configuration available on the FileSystemStorage class. I've
> > > added that to the doc.
> > >
> > > 2. With the new implementation, since users could set checkpoint
> storage
> > > through API, do we also support the combination of
> > > `EmbeddedRocksDBStateBackend` with `JobManagerCheckpointStorage`?
> > >
> > > This is actually doable today and I don't see any reason to remove this
> > > functionality: new RocksDBStateBackend(new MemoryStateBackend())
> > >
> > > 1. There are still some `SnapshotStorage` / `JobManagerSnapshot` left
> in
> > > the code samples, please clean them up
> > >
> > > Apologies, fixed
> > >
> > > 2. Personally I'm in favor of `JobManagerCheckpointStorage` /
> > > `FileSystemCheckpointStorage` than `JobManagerStorage` /
> > `FileSystemStorage
> > >
> > > That's fine by me
> > >
> > > Seth
> > >
> > > On Fri, Sep 18, 2020 at 9:36 AM Yu Li <car...@gmail.com> wrote:
> > >
> > > > *bq. I agree with your assessment of the CheckpointStorage interface
> > but
> > > I
> > > > want to push back at including those changes as a part of this FLIP.*
> > > > Makes sense, will start a separate discussion around this topic when
> > > > prepared.
> > > >
> > > > *bq. One option could be to rename "CheckpointStorage" to
> > > > "CheckpointStorageAccess" and then use the name "CheckpointStorage"
> > > instead
> > > > of "SnapshotStorage". *
> > > > +1
> > > >
> > > > And thanks for updating the document, some comments for the new
> > version:
> > > >
> > > > Questions around migration:
> > > > 1. With `FSStateBackend`, we used to decide where to store the
> > checkpoint
> > > > by the `state.backend.fs.memory-threshold` configuration, and we need
> > to
> > > > decide how to align with this behavior with the new implementation.
> > > > 2. With the new implementation, since users could set checkpoint
> > storage
> > > > through API, do we also support the combination of
> > > > `EmbeddedRocksDBStateBackend` with `JobManagerCheckpointStorage`?
> > > >
> > > > One answer to the above questions is making
> > > > `JobManagerCheckpointStorage` internal implementation and use it as
> the
> > > > default checkpoint storage. And when user sets to
> > > > use `FileSystemCheckpointStorage`, we will still switch to
> > > > `JobManagerCheckpointStorage` when the task checkpoint size is
> smaller
> > > than
> > > > `state.backend.fs.memory-threshold`, even with RocksDB state backend.
> > > This
> > > > will align with most of the current behavior except for RocksDB
> backend
> > > > with really small checkpoint size.
> > > >
> > > > Minor issues:
> > > > 1. There are still some `SnapshotStorage` / `JobManagerSnapshot` left
> > in
> > > > the code samples, please clean them up
> > > > 2. Personally I'm in favor of `JobManagerCheckpointStorage` /
> > > > `FileSystemCheckpointStorage` than `JobManagerStorage` /
> > > > `FileSystemStorage`
> > > >
> > > > Thanks.
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > > On Fri, 18 Sep 2020 at 01:58, Seth Wiesman <sjwies...@gmail.com>
> > wrote:
> > > >
> > > > > That makes sense to me, I've updated the FLIP and also took this
> > chance
> > > > to
> > > > > make it clearer what the goals and non-goals of this proposal are.
> > > > >
> > > > > Seth
> > > > >
> > > > > On Thu, Sep 17, 2020 at 9:17 AM Stephan Ewen <se...@apache.org>
> > wrote:
> > > > >
> > > > > > Just a quick note that it should be possible to rename
> > > > > "CheckpointStorage"
> > > > > > because it is a purely internal interface.
> > > > > >
> > > > > > Looks like the "SnapshotStorage" takes some limited amount of
> > > > > functionality
> > > > > > from the "CheckpointStorage", like location pointer resolution.
> > > > > > One option could be to rename "CheckpointStorage" to
> > > > > > "CheckpointStorageAccess" and then use the name
> "CheckpointStorage"
> > > > > instead
> > > > > > of "SnapshotStorage".
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Sep 17, 2020 at 3:47 PM Seth Wiesman <
> sjwies...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Yu,
> > > > > > >
> > > > > > > I've updated the Deprecation / Compatibility / Migration
> section
> > to
> > > > > more
> > > > > > > explicitly lay out the steps that we would take as part of this
> > > FLIP.
> > > > > It
> > > > > > > includes your above concerns.
> > > > > > >
> > > > > > > Regarding SnapshotStorage vs CheckpointStorage. I'm not sure
> > users
> > > > are
> > > > > > > going to have a problem with this. I doubt many people outside
> > this
> > > > > > thread
> > > > > > > are familiar with the CheckpointStorage interface today. Even
> > with
> > > > > these
> > > > > > > changes implemented, most users will not interact with the
> > > > > > SnapshotStorage
> > > > > > > interface. They will only ever see JobManagerStorage and
> > > > > > FileSystemStorage.
> > > > > > >
> > > > > > > I agree with your assessment of the CheckpointStorage interface
> > > but I
> > > > > > want
> > > > > > > to push back at including those changes as a part of this FLIP.
> > The
> > > > > goal
> > > > > > is
> > > > > > > to simplify users' understanding of state backends and
> > > > checkpointing. I
> > > > > > > would like to keep anything related to the runtime or internal
> > as a
> > > > > > > non-goal.
> > > > > > >
> > > > > > > Seth
> > > > > > >
> > > > > > > On Thu, Sep 17, 2020 at 3:03 AM Yu Li <car...@gmail.com>
> wrote:
> > > > > > >
> > > > > > > > Thanks for the suggestion and discussion, and sorry for being
> > > late
> > > > to
> > > > > > the
> > > > > > > > party.
> > > > > > > >
> > > > > > > > For me, +1 for the idea, but +0 for the current FLIP
> document.
> > > > > > > >
> > > > > > > > First of all, I suggest we explicitly mention the deprecation
> > of
> > > > > > existing
> > > > > > > > backends in the document. From the description, we plan to
> mark
> > > all
> > > > > > > > existing backend implementations (i.e.
> > > > > > > > RocksDBStateBackend/MemoryStateBackend/FSStateBackend) as
> > > > deprecated,
> > > > > > and
> > > > > > > > in their javadoc we should give the suggestion of migration
> to
> > > new
> > > > > > > > implementations (i.e.
> > > > > HashMapStateBackend/EmbeddedRocksDBStateBackend).
> > > > > > > >
> > > > > > > > Secondly, I suggest we explicitly mention the user-facing
> > changes
> > > > for
> > > > > > > > customized state backends.
> > > > > > > >
> > > > > > > > To be more specific, the above two should be included in the
> > > > > > > > "Compatibility, Deprecation, and Migration Plan" section. The
> > > > > existing
> > > > > > > > document already mentioned these two aspects, but IMO not
> > > explicit
> > > > > > > enough.
> > > > > > > >
> > > > > > > > Thirdly, we already have a `CheckpointStorage` interface and
> > now
> > > > > > > > introducing a new `SnapshotStoage`, and I share the same
> > concern
> > > > with
> > > > > > > > Stephan that these two interfaces might cause confusion, and
> > > > suggest
> > > > > we
> > > > > > > > discuss more about this part.
> > > > > > > >
> > > > > > > > This might sound to be a little bit off-track, but I think
> it's
> > > > > > necessary
> > > > > > > > to review the necessity of the existence of current
> > > > > > `CheckpointStorage`.
> > > > > > > It
> > > > > > > > seems to me that only JM-side logic will use interfaces in
> > > > > > > > `CheckpointStorageCoordinatorView` and only TM-side logic use
> > > > > > > > `CheckpointStorageWorkerView`, but we combine these two
> > together.
> > > > > > What's
> > > > > > > > more, if we check it carefully, we could find the signature
> of
> > > > > > > > `resolveCheckpoint` interface in current `StateBackend` and
> > > > > > > > `CheckpointStorageCoordinatorView` are exactly the same (even
> > the
> > > > > > > javadoc),
> > > > > > > > which means if we simply extract `resolveCheckpoint` out into
> > > > > > > > `SnapshotStorage`, there will be two interfaces with the same
> > > > > signature
> > > > > > > in
> > > > > > > > `SnapshotStorage` and `CheckpointStorage`, which will be
> really
> > > > > > > confusing.
> > > > > > > > Sorry but I don't have a proposal of solution yet, but I
> > suggest
> > > we
> > > > > > > figure
> > > > > > > > this out clearly.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Best Regards,
> > > > > > > > Yu
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, 17 Sep 2020 at 13:19, Congxian Qiu <
> > > qcx978132...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the detailed replay, +1 from my side.
> > > > > > > > > Best,
> > > > > > > > > Congxian
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Seth Wiesman <sjwies...@gmail.com> 于2020年9月17日周四 上午1:33写道：
> > > > > > > > >
> > > > > > > > > > Hi Stephan,
> > > > > > > > > >
> > > > > > > > > > Regarding backward compatibility, I agree and the
> intention
> > > is
> > > > > that
> > > > > > > all
> > > > > > > > > > existing code will continue to function with the same
> > > > semantics.
> > > > > My
> > > > > > > > > working
> > > > > > > > > > idea is to remove the two checkpoint-storage related
> > methods
> > > > from
> > > > > > > > > > StateBackend into a new SnapshotStorage interface but
> then
> > > have
> > > > > > > > > > AbstractFileStateBackend and RocksDBStateBackend
> implement
> > > > > snapshot
> > > > > > > > > > storage. If a state backend implements SnapshotStorage it
> > > will
> > > > be
> > > > > > > used
> > > > > > > > > > unconditionally, even if a different snapshot storage
> > > > > > implementation
> > > > > > > is
> > > > > > > > > > configured. This way we don't change any of the concrete
> > > > classes
> > > > > > that
> > > > > > > > > users
> > > > > > > > > > interact with. The only people who would see breaking
> > changes
> > > > are
> > > > > > > state
> > > > > > > > > > backend implementors and they only need to add
> `implements
> > > > > > > > > SnapshotStorage`
> > > > > > > > > > to their class.
> > > > > > > > > >
> > > > > > > > > > The reason I went with SnapshotStorage is there is
> already
> > an
> > > > > > > interface
> > > > > > > > > > `org.apache.flink.runtime.state.CheckpointStorage` in
> > > > > > flink-runtime.
> > > > > > > If
> > > > > > > > > we
> > > > > > > > > > can rename this interface to something else I'm happy to
> > take
> > > > the
> > > > > > > name,
> > > > > > > > > but
> > > > > > > > > > if not I think it will lead to import confusion.
> > > > > > > > > >
> > > > > > > > > > Seth
> > > > > > > > > >
> > > > > > > > > > On Wed, Sep 16, 2020 at 11:54 AM Stephan Ewen <
> > > > se...@apache.org>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > @Yun and @Congxian:
> > > > > > > > > > >
> > > > > > > > > > > I think "async", "incremental", and similar flags
> belong
> > > very
> > > > > > much
> > > > > > > > with
> > > > > > > > > > the
> > > > > > > > > > > state backend (the index structure).
> > > > > > > > > > > They define how the snapshotting procedure behaves.
> > > > > > > > > > >
> > > > > > > > > > > The SnapshotStorage is really just about storage of
> > > > checkpoint
> > > > > > > > streams
> > > > > > > > > > > (bytes) and handles and pointers.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Stephan
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Sep 16, 2020 at 6:48 PM Stephan Ewen <
> > > > se...@apache.org
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the great suggestion and the great
> > discussion.
> > > > > > > Generally
> > > > > > > > > big
> > > > > > > > > > > +1
> > > > > > > > > > > > to this effort.
> > > > > > > > > > > > Some thoughts from my side:
> > > > > > > > > > > >
> > > > > > > > > > > > *## Backwards Compatibility*
> > > > > > > > > > > >
> > > > > > > > > > > > I think we should really strive to make this non
> > > breaking.
> > > > > > Maybe
> > > > > > > we
> > > > > > > > > > have
> > > > > > > > > > > > new classes / interfaces for StateBackends and
> > > > > > CheckpointStorage
> > > > > > > > and
> > > > > > > > > > let
> > > > > > > > > > > > the existing State Backend classes implement both
> (and
> > > > > > deprecate
> > > > > > > > > them)?
> > > > > > > > > > > >
> > > > > > > > > > > > In the past, I have gotten some harsh comments from
> > users
> > > > > about
> > > > > > > > > > breaking
> > > > > > > > > > > > long-time effectively stable APIs, so let's try hard
> to
> > > > avoid
> > > > > > > this
> > > > > > > > > > > (unless
> > > > > > > > > > > > it makes things impossible).
> > > > > > > > > > > >
> > > > > > > > > > > > *## Naming*
> > > > > > > > > > > >
> > > > > > > > > > > > HashMapStateBackend sounds good to me
> > > > > > > > > > > >
> > > > > > > > > > > > Could we rename the SnapshotStorage to
> > CheckpointStorage?
> > > > Or
> > > > > > > > converge
> > > > > > > > > > all
> > > > > > > > > > > > methods around "Snapshot"?
> > > > > > > > > > > > I think we already have some confusion from mixing
> the
> > > > terms
> > > > > > > > > checkpoint
> > > > > > > > > > > > and snapshot and should converge in either direction.
> > > > > > > > > > > > I am slightly leaning towards converging around
> > > > checkpoints,
> > > > > > > > because
> > > > > > > > > > > > that's the most commonly known term among users as
> far
> > > as I
> > > > > can
> > > > > > > > tell.
> > > > > > > > > > > > Checkpoints are Snapshots. But one could also just
> call
> > > > them
> > > > > > > > > > Checkpoints
> > > > > > > > > > > > and let Savepoints be special Checkpoints.
> > > > > > > > > > > >
> > > > > > > > > > > > *## Integrated State / Storage Backends*
> > > > > > > > > > > >
> > > > > > > > > > > > There is an idea floating around now and then about a
> > > > > Cassandra
> > > > > > > > > backend
> > > > > > > > > > > > (or other K/V store) where the state index and
> durable
> > > > > location
> > > > > > > are
> > > > > > > > > > > tightly
> > > > > > > > > > > > intertwined.
> > > > > > > > > > > > However, I think this would not contradict, because
> it
> > > > might
> > > > > > just
> > > > > > > > > mean
> > > > > > > > > > > > that the checkpoint storage is used less (maybe only
> > for
> > > > > > > > savepoints,
> > > > > > > > > or
> > > > > > > > > > > for
> > > > > > > > > > > > WALs).
> > > > > > > > > > > >
> > > > > > > > > > > > *## Future Fault Tolerance Ideas*
> > > > > > > > > > > >
> > > > > > > > > > > > I think this conflicts with none of the future fault
> > > > > tolerance
> > > > > > > > ideas
> > > > > > > > > I
> > > > > > > > > > am
> > > > > > > > > > > > involved with.
> > > > > > > > > > > > Similar to the above, there is always some checkpoint
> > > > storage
> > > > > > > > > involved,
> > > > > > > > > > > > for example for savepoints or for
> backup/consolidation,
> > > so
> > > > no
> > > > > > > > > problem.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Stephan
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Sep 16, 2020 at 5:11 PM Aljoscha Krettek <
> > > > > > > > > aljos...@apache.org>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >> I think the mentioned settings should be in the
> state
> > > > > backend.
> > > > > > > > They
> > > > > > > > > > > >> configure how a certain backend writes to a snapshot
> > > > > storage,
> > > > > > > but
> > > > > > > > > it’s
> > > > > > > > > > > >> still the backend that has the logic and decides.
> > > > > > > > > > > >>
> > > > > > > > > > > >> I think it's a good point, though, to be conscious
> > about
> > > > > those
> > > > > > > > > > settings.
> > > > > > > > > > > >> I'm sure we can figure out the details during
> > > > > implementation,
> > > > > > > > > though.
> > > > > > > > > > > >>
> > > > > > > > > > > >> Best,
> > > > > > > > > > > >> Aljoscha
> > > > > > > > > > > >>
> > > > > > > > > > > >> On 16.09.20 16:54, Seth Wiesman wrote:
> > > > > > > > > > > >> > Hi Congxian,
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > There is an allusion to those configs in the wiki
> > but
> > > > let
> > > > > me
> > > > > > > > > better
> > > > > > > > > > > >> spell
> > > > > > > > > > > >> > out my thinking. The flink-conf configurations
> will
> > > not
> > > > > > change
> > > > > > > > > and I
> > > > > > > > > > > >> > believe the java code switches should remain on
> the
> > > > state
> > > > > > > > backend
> > > > > > > > > > > >> objects.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > We are of course not fully disentangling state
> > > backends
> > > > > from
> > > > > > > > > > snapshots
> > > > > > > > > > > >> and
> > > > > > > > > > > >> > these configurations affect how your state backend
> > > runs
> > > > in
> > > > > > > > > > > production. I
> > > > > > > > > > > >> > believe users would find it strange to have
> > > > configurations
> > > > > > > like
> > > > > > > > > > > >> >
> > `state.backend.rocksdb.checkpoint.transfer.thred.num`
> > > > not
> > > > > be
> > > > > > > > part
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > >> > EmbeddedRocksdbStateBackend but somewhere else.
> This
> > > > then
> > > > > > > leads
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > >> > question, is it better to split configurations
> > between
> > > > > > > multiple
> > > > > > > > > > places
> > > > > > > > > > > >> or
> > > > > > > > > > > >> > not. Users appreciate consistency, and so having
> all
> > > the
> > > > > > > > > > > configurations
> > > > > > > > > > > >> on
> > > > > > > > > > > >> > the state backend objects makes them more
> > discoverable
> > > > and
> > > > > > > your
> > > > > > > > > > > >> application
> > > > > > > > > > > >> > easier to reason about.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Additionally, I view these as advanced
> > configurations.
> > > > My
> > > > > > hope
> > > > > > > > is
> > > > > > > > > > most
> > > > > > > > > > > >> > users can simply use the no-arg constructor for a
> > > state
> > > > > > > backend
> > > > > > > > in
> > > > > > > > > > > >> > production. If a user is changing the number of
> > > rocksdb
> > > > > > > transfer
> > > > > > > > > > > >> threads or
> > > > > > > > > > > >> > disabling async checkpoints, they likely know what
> > > they
> > > > > are
> > > > > > > > doing.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Please let me know if you have any concerns or
> would
> > > > like
> > > > > to
> > > > > > > > > cancel
> > > > > > > > > > > the
> > > > > > > > > > > >> > vote.
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > Seth
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Wed, Sep 16, 2020 at 12:37 AM Congxian Qiu <
> > > > > > > > > > qcx978132...@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > >> > wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >> Sorry for jump late in.
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >> I like the separation here, this separation makes
> > > more
> > > > > user
> > > > > > > > > > friendly
> > > > > > > > > > > >> now.
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >> I just wonder how the configuration such as
> > > > > > > > > > > >> 'state.backend.incremental',
> > > > > > > > > > > >> >> 'state.backend.async' and
> > > > > > > > > > > >> >>
> > `state.backend.rocksdb.checkpoint.transfer.thred.num`
> > > > > will
> > > > > > be
> > > > > > > > > > > >> configured
> > > > > > > > > > > >> >> after the separation, I think these
> configurations
> > > are
> > > > > more
> > > > > > > > > related
> > > > > > > > > > > to
> > > > > > > > > > > >> >> snapshots (maybe a little strange to configure
> > these
> > > on
> > > > > > > > > > statebackend
> > > > > > > > > > > >> side).
> > > > > > > > > > > >> >> did not see this on the FLIP wiki currently.
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >> Best,
> > > > > > > > > > > >> >> Congxian
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >> Seth Wiesman <sjwies...@gmail.com> 于2020年9月15日周二
> > > > > 下午9:51写道：
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >>> Sounds good to me. I'll update the FLIP.
> > > > > > > > > > > >> >>>
> > > > > > > > > > > >> >>> On Tue, Sep 15, 2020 at 8:35 AM Dawid
> Wysakowicz <
> > > > > > > > > > > >> dwysakow...@apache.org
> > > > > > > > > > > >> >>>
> > > > > > > > > > > >> >>> wrote:
> > > > > > > > > > > >> >>>
> > > > > > > > > > > >> >>>> There is a good number of precedents that
> > > introduced
> > > > > > > > backwards
> > > > > > > > > > > >> >>>> incompatible changes to that interface (which
> is
> > > > > > > > PublicEvolving
> > > > > > > > > > > btw).
> > > > > > > > > > > >> >> We
> > > > > > > > > > > >> >>>> introduced a couple of additional arguments to
> > the
> > > > > > > > > > > >> >>>> createKeyedStateBackend method and later on
> > removed
> > > > the
> > > > > > > > methods
> > > > > > > > > > > with
> > > > > > > > > > > >> >>>> default implementation for backwards
> > > compatibility. I
> > > > > > want
> > > > > > > to
> > > > > > > > > > > >> introduce
> > > > > > > > > > > >> >>>> a backward incompatible change in FLIP-140
> > (replace
> > > > the
> > > > > > > > > > > >> >>>> AbstractKeyedStateBackend with an interface).
> > From
> > > my
> > > > > > > > > perspective
> > > > > > > > > > > we
> > > > > > > > > > > >> >>>> should just do these changes. The impact should
> > be
> > > > > > minimal.
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>> Best,
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>> Dawid
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>> On 15/09/2020 15:20, Seth Wiesman wrote:
> > > > > > > > > > > >> >>>>> Hey Dawid,
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>> I didn't want to break compatibility but if
> > there
> > > is
> > > > > > > > precedent
> > > > > > > > > > and
> > > > > > > > > > > >> >>>> everyone
> > > > > > > > > > > >> >>>>> is ok with it then I'm +1.
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>> Seth
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>> On Tue, Sep 15, 2020 at 2:22 AM Dawid
> > Wysakowicz <
> > > > > > > > > > > >> >>> dwysakow...@apache.org
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>> wrote:
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>>> Sorry for joining so late.
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> Generally speaking I like this idea very
> much!
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> I have one idea about the StateBackend
> > interface.
> > > > > Could
> > > > > > > we
> > > > > > > > > > > instead
> > > > > > > > > > > >> >> of
> > > > > > > > > > > >> >>>>>> adding a flag method boolean
> > isLegacyStateBackend
> > > > > > remove
> > > > > > > > the
> > > > > > > > > > > >> >>>>>> checkpointstorage related methods from
> > > StateBackend
> > > > > > right
> > > > > > > > > away?
> > > > > > > > > > > The
> > > > > > > > > > > >> >>>>>> old/legacy implementations could then
> implement
> > > > both
> > > > > > > > > > StateBackend
> > > > > > > > > > > >> >> and
> > > > > > > > > > > >> >>>>>> SnapshotStorage. In turn in the method
> > > > > > > env.setStateBackend
> > > > > > > > we
> > > > > > > > > > > could
> > > > > > > > > > > >> >>> do:
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> setStateBackend(StateBackend backend) {
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>>      this.stateBackend = backend;
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>>      if (backend instanceof SnapshotStorage)
> {
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>>           this.setSnapshotStorage(backend);
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>>      }
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> }
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> This has the benefit that we could already
> get
> > > rid
> > > > > off
> > > > > > > the
> > > > > > > > > > > methods
> > > > > > > > > > > >> >>> from
> > > > > > > > > > > >> >>>>>> StateBackend which would be problematic in
> the
> > > new
> > > > > > > > > > > implementations
> > > > > > > > > > > >> >>> (such
> > > > > > > > > > > >> >>>>>> as e.g. HashMapStateBackend - what would you
> > > return
> > > > > > > there?
> > > > > > > > > > > null?).
> > > > > > > > > > > >> I
> > > > > > > > > > > >> >>>>>> know this would break the interface, but
> > > > StateBackend
> > > > > > is
> > > > > > > > > > actually
> > > > > > > > > > > >> >>> quite
> > > > > > > > > > > >> >>>>>> internal, we did it quite freely in the past,
> > > and I
> > > > > > don't
> > > > > > > > > think
> > > > > > > > > > > >> >> there
> > > > > > > > > > > >> >>>>>> are many custom state implementation in the
> > wild.
> > > > And
> > > > > > > even
> > > > > > > > if
> > > > > > > > > > > there
> > > > > > > > > > > >> >>> are
> > > > > > > > > > > >> >>>>>> some the workaround is as easy as simply
> adding
> > > > > > > implements
> > > > > > > > > > > >> >>>> SnapshotStorage.
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> Best,
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> Dawid
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>>> On 11/09/2020 16:48, Aljoscha Krettek wrote:
> > > > > > > > > > > >> >>>>>>> I could try and come up with a longer name
> if
> > > you
> > > > > need
> > > > > > > it
> > > > > > > > > ...
> > > > > > > > > > > ;-)
> > > > > > > > > > > >> >>>>>>>
> > > > > > > > > > > >> >>>>>>> Aljoscha
> > > > > > > > > > > >> >>>>>>>
> > > > > > > > > > > >> >>>>>>> On 11.09.20 16:25, Seth Wiesman wrote:
> > > > > > > > > > > >> >>>>>>>> Having thought about it more,
> > > HashMapStateBackend
> > > > > has
> > > > > > > won
> > > > > > > > > me
> > > > > > > > > > > >> over.
> > > > > > > > > > > >> >>>> I'll
> > > > > > > > > > > >> >>>>>>>> update the FLIP. If there aren't any more
> > > > comments
> > > > > > I'll
> > > > > > > > > open
> > > > > > > > > > it
> > > > > > > > > > > >> up
> > > > > > > > > > > >> >>> for
> > > > > > > > > > > >> >>>>>>>> voting on monday.
> > > > > > > > > > > >> >>>>>>>>
> > > > > > > > > > > >> >>>>>>>> Seth
> > > > > > > > > > > >> >>>>>>>>
> > > > > > > > > > > >> >>>>>>>> On Wed, Sep 9, 2020 at 9:09 AM Seth
> Wiesman <
> > > > > > > > > > > sjwies...@gmail.com
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >>>>>> wrote:
> > > > > > > > > > > >> >>>>>>>>> @Yun yes, this is really about making
> > > > > > > CheckpointStorage
> > > > > > > > an
> > > > > > > > > > > >> >>> orthogonal
> > > > > > > > > > > >> >>>>>>>>> concept. I think we can remain pragmatic
> and
> > > > keep
> > > > > > > > > > > state-backend
> > > > > > > > > > > >> >>>>>>>>> specific
> > > > > > > > > > > >> >>>>>>>>> configurations (async, incremental, etc)
> in
> > > the
> > > > > > state
> > > > > > > > > > backend
> > > > > > > > > > > >> >>>>>>>>> themselves. I
> > > > > > > > > > > >> >>>>>>>>> view these as more advanced configurations
> > and
> > > > by
> > > > > > the
> > > > > > > > time
> > > > > > > > > > > >> >> someone
> > > > > > > > > > > >> >>> is
> > > > > > > > > > > >> >>>>>>>>> changing the defaults they likely
> understand
> > > > what
> > > > > is
> > > > > > > > going
> > > > > > > > > > on.
> > > > > > > > > > > >> My
> > > > > > > > > > > >> >>>>>>>>> goal is
> > > > > > > > > > > >> >>>>>>>>> to help on-board users and so long as each
> > > state
> > > > > > > backend
> > > > > > > > > > has a
> > > > > > > > > > > >> >>> no-arg
> > > > > > > > > > > >> >>>>>>>>> default constructor that works for many
> > users
> > > I
> > > > > > think
> > > > > > > > > we've
> > > > > > > > > > > >> >>> achieved
> > > > > > > > > > > >> >>>>>>>>> that
> > > > > > > > > > > >> >>>>>>>>> goal.
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> Regarding the checkpoint coordinator, that
> > > makes
> > > > > > sense
> > > > > > > > > but I
> > > > > > > > > > > >> will
> > > > > > > > > > > >> >>>>>>>>> consider
> > > > > > > > > > > >> >>>>>>>>> out of the scope of this FLIP. I want to
> > focus
> > > > on
> > > > > > > > > > simplifying
> > > > > > > > > > > >> >> APIs.
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> @Aljoscha Krettek <aljos...@apache.org>
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> My feeling is that state backends and
> > > > > checkpointing
> > > > > > > are
> > > > > > > > > > going
> > > > > > > > > > > to
> > > > > > > > > > > >> >> be
> > > > > > > > > > > >> >>>>>>>>> integral to Flink for many years,
> regardless
> > > or
> > > > > > other
> > > > > > > > > > > >> >> enhancements
> > > > > > > > > > > >> >>>>>>>>> so this
> > > > > > > > > > > >> >>>>>>>>> change is still valuable.
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> Since this is a FLIP about improving the
> > user
> > > > api
> > > > > > I'm
> > > > > > > > > happy
> > > > > > > > > > to
> > > > > > > > > > > >> >>>> bikeshed
> > > > > > > > > > > >> >>>>>>>>> the names a little more than normal.
> HashMap
> > > > makes
> > > > > > > > sense,
> > > > > > > > > my
> > > > > > > > > > > >> >> other
> > > > > > > > > > > >> >>>>>>>>> thought
> > > > > > > > > > > >> >>>>>>>>> was InMemory.
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> Seth
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>> On Wed, Sep 9, 2020 at 8:04 AM Aljoscha
> > > Krettek
> > > > <
> > > > > > > > > > > >> >>> aljos...@apache.org
> > > > > > > > > > > >> >>>>>
> > > > > > > > > > > >> >>>>>>>>> wrote:
> > > > > > > > > > > >> >>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>> I like it a lot!
> > > > > > > > > > > >> >>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>> I think it makes sense to clean this up
> > > despite
> > > > > the
> > > > > > > > > planned
> > > > > > > > > > > new
> > > > > > > > > > > >> >>>>>>>>>> fault-tolerance mechanisms. In the
> future,
> > > > users
> > > > > > will
> > > > > > > > > > decide
> > > > > > > > > > > >> >> which
> > > > > > > > > > > >> >>>>>>>>>> mechanism to use and I can imagine that a
> > lot
> > > > of
> > > > > > them
> > > > > > > > > will
> > > > > > > > > > > keep
> > > > > > > > > > > >> >>>> using
> > > > > > > > > > > >> >>>>>>>>>> the current mechanism for quite a while
> to
> > > > come.
> > > > > > But
> > > > > > > > I'm
> > > > > > > > > > > happy
> > > > > > > > > > > >> >> to
> > > > > > > > > > > >> >>>>>>>>>> yield
> > > > > > > > > > > >> >>>>>>>>>> to Stephan's opinion here, he knows more
> > > about
> > > > > the
> > > > > > > > > progress
> > > > > > > > > > > of
> > > > > > > > > > > >> >>> that
> > > > > > > > > > > >> >>>>>>>>>> work.
> > > > > > > > > > > >> >>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>> The one nitpick I have is about naming:
> > will
> > > > > users
> > > > > > > > > > understand
> > > > > > > > > > > >> >>>>>>>>>> OnHeapStateBackend? I mean, do they know
> > what
> > > > > > > > > > > on-heap/off-heap
> > > > > > > > > > > >> >>>>>>>>>> memory is
> > > > > > > > > > > >> >>>>>>>>>> and the tradeoffs? An alternative could
> be
> > > > > > > > > > > HashMapStateBackend,
> > > > > > > > > > > >> >>>>>>>>>> because
> > > > > > > > > > > >> >>>>>>>>>> that's essentially what it is. I wouldn't
> > > block
> > > > > > > > anything
> > > > > > > > > on
> > > > > > > > > > > >> >> this,
> > > > > > > > > > > >> >>>>>>>>>> though.
> > > > > > > > > > > >> >>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>> Aljoscha
> > > > > > > > > > > >> >>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>> On 09.09.20 10:05, Konstantin Knauf
> wrote:
> > > > > > > > > > > >> >>>>>>>>>>> Thanks for the initiative. Big +1. Would
> > be
> > > > > > > interested
> > > > > > > > > to
> > > > > > > > > > > hear
> > > > > > > > > > > >> >> if
> > > > > > > > > > > >> >>>> the
> > > > > > > > > > > >> >>>>>>>>>>> proposed interfaces still make sense in
> > the
> > > > face
> > > > > > of
> > > > > > > > the
> > > > > > > > > > new
> > > > > > > > > > > >> >>>>>>>>>> fault-tolerance
> > > > > > > > > > > >> >>>>>>>>>>> work that is planned. Stephan/Piotr will
> > > know.
> > > > > > > > > > > >> >>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>> On Tue, Sep 8, 2020 at 7:05 PM Seth
> > Wiesman
> > > <
> > > > > > > > > > > >> >> sjwies...@gmail.com
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>>>>>>>> wrote:
> > > > > > > > > > > >> >>>>>>>>>>>> Hi Devs,
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>> I'd like to propose an update to how
> > state
> > > > > > backends
> > > > > > > > and
> > > > > > > > > > > >> >>> checkpoint
> > > > > > > > > > > >> >>>>>>>>>> storage
> > > > > > > > > > > >> >>>>>>>>>>>> are configured to help users better
> > > > understand
> > > > > > > Flink.
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>> Apache Flink's durability story is a
> > > mystery
> > > > to
> > > > > > > many
> > > > > > > > > > users.
> > > > > > > > > > > >> >> One
> > > > > > > > > > > >> >>>>>>>>>>>> of the
> > > > > > > > > > > >> >>>>>>>>>> most
> > > > > > > > > > > >> >>>>>>>>>>>> common recurring questions from users
> > comes
> > > > > from
> > > > > > > not
> > > > > > > > > > > >> >>>>>>>>>>>> understanding the
> > > > > > > > > > > >> >>>>>>>>>>>> relationship between state, state
> > backends,
> > > > and
> > > > > > > > > > snapshots.
> > > > > > > > > > > >> >> Some
> > > > > > > > > > > >> >>>>>>>>>>>> of this
> > > > > > > > > > > >> >>>>>>>>>>>> confusion can be abated with learning
> > > > material
> > > > > > but
> > > > > > > > the
> > > > > > > > > > > >> >> question
> > > > > > > > > > > >> >>>>>>>>>>>> is so
> > > > > > > > > > > >> >>>>>>>>>>>> pervasive that we believe Flink’s user
> > APIs
> > > > > > should
> > > > > > > be
> > > > > > > > > > > better
> > > > > > > > > > > >> >>>>>>>>>> communicate
> > > > > > > > > > > >> >>>>>>>>>>>> what different components are
> responsible
> > > > for.
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>
> > > > > > > > > > > >> >>
> > > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-142%3A+Disentangle+StateBackends+from+Checkpointing
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>> I look forward to a healthy discussion.
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>> Seth
> > > > > > > > > > > >> >>>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>>
> > > > > > > > > > > >> >>>>>>>>>>
> > > > > > > > > > > >> >>>>>>
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>>
> > > > > > > > > > > >> >>>
> > > > > > > > > > > >> >>
> > > > > > > > > > > >> >
> > > > > > > > > > > >>
> > > > > > > > > > > >>
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-142: Disentangle StateBackends from Checkpointing

Reply via email to