Hi everyone,

It seems too late to get this into 1.19, so I suggest targeting 1.20 instead.
Another thing I'd like to highlight is that some existing option classes
lack annotations:

   - org.apache.flink.configuration.CheckpointingOptions
   - org.apache.flink.configuration.StateBackendOptions
   - org.apache.flink.contrib.streaming.state.RocksDBOptions

This FLIP will annotate these existing classes with @PublicEvolving in
version 2.0, since almost all of the state-related option classes and APIs
are already annotated with @PublicEvolving. Moreover, the migration period
for the deprecated options in those classes will last for one minor release
(1.20), which meets the migration period requirement for @PublicEvolving [1].
The deprecated options will then be removed in 2.0.
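
To make the migration period concrete, here is a minimal sketch (plain
Python, not Flink code) of how a renamed option can keep honoring its
deprecated key until that key is removed. The helper `read_option` is
hypothetical, and the key pair is only an example taken from this thread
(`state.backend.incremental` and `execution.checkpointing.incremental`);
the final names are whatever the FLIP specifies.

```python
import warnings

# Example key pair taken from this thread; treat both names as assumptions.
NEW_KEY = "execution.checkpointing.incremental"
DEPRECATED_KEY = "state.backend.incremental"


def read_option(conf, new_key, deprecated_key, default=None):
    """Return the value of new_key, falling back to deprecated_key with a warning.

    `conf` is a plain dict standing in for a loaded configuration.
    """
    # The new key always wins when both keys are present.
    if new_key in conf:
        return conf[new_key]
    # During the migration period the old key is still honored, but flagged.
    if deprecated_key in conf:
        warnings.warn(
            f"'{deprecated_key}' is deprecated, use '{new_key}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return conf[deprecated_key]
    # Neither key is set: use the caller-supplied default.
    return default
```

With this shape, a job that still sets only the old key keeps working during
1.20 and sees a warning; once the deprecated key is removed in 2.0, only the
new key would be read.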

Based on the discussion so far, I will proceed to start the vote tomorrow.
Thanks!

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process


Best,
Zakelly

On Mon, Jan 22, 2024 at 6:31 PM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi everyone,
>
> It has been 6 days since the last call for discussion. I'd like to start a
> vote after another 2 days.
>
> Please let me know if you have any concerns. Thanks!
>
>
> Best,
> Zakelly
>
> On Tue, Jan 16, 2024 at 2:54 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>
>> Thanks for the suggestion Rui!  The type is added.
>>
>>
>> Best,
>> Zakelly
>>
>> On Tue, Jan 16, 2024 at 2:33 PM Rui Fan <1996fan...@gmail.com> wrote:
>>
>>> Hi Zakelly,
>>>
>>> Would you mind adding the option type in the FLIP doc?
>>> For example, String, Boolean or Enum, etc. Thank you.
>>>
>>> Best,
>>> Rui
>>>
>>> On Tue, Jan 16, 2024 at 2:29 PM Zakelly Lan <zakelly....@gmail.com>
>>> wrote:
>>>
>>> > Hi everyone,
>>> >
>>> > Thanks all for joining the discussion! I'd like to speed this up since
>>> > it has lasted for nearly a month. I made changes to this FLIP based on
>>> > suggestions and compromises acceptable to most people. Please feel free
>>> > to give your opinion. Thanks!
>>> > If there are no more suggestions, I will consider starting a vote
>>> > within a week.
>>> >
>>> >
>>> > Best,
>>> > Zakelly
>>> >
>>> > On Thu, Jan 11, 2024 at 10:31 AM Xuannan Su <suxuanna...@gmail.com> wrote:
>>> >
>>> > > Hi Zakelly,
>>> > >
>>> > > I am fine with either Option 2 or Option 3. I think the naming in
>>> > > Option 2 makes it clear that it is a boolean configuration. However,
>>> > > most of the currently available boolean configurations do not use
>>> > > "enabled" as a suffix. Therefore, Option 3 looks good to me as it
>>> > > follows the current practice.
>>> > >
>>> > > Best regards,
>>> > > Xuannan
>>> > >
>>> > > On Thu, Jan 11, 2024 at 9:50 AM Hangxiang Yu <master...@gmail.com> wrote:
>>> > > >
>>> > > > > That's a very good point. I realize that the word 'recovery' means
>>> > > > > way too many things. So I suggest picking a more specific word here,
>>> > > > > how about 'execution.state-recovery.*' ? Checkpointing and state
>>> > > > > recovery are corresponding terms and won't make ambiguity.
>>> > > >
>>> > > > This makes the configuration clearer to me. We could focus on the
>>> > > > `state-recovery` at first.
>>> > > >
>>> > > > > I think we could create another FLIP for the deprecation of LEGACY
>>> > > > > mode.
>>> > > >
>>> > > > LGTM, Let's create a new FLIP to do this.
>>> > > >
>>> > > > > IIUC, there is no clear ownership of the local copy files from the
>>> > > > > previous job and it's better to define one. This needs more
>>> > > > > discussion so we could create another thread for this. WDYT?
>>> > > >
>>> > > > Yeah, I have created a new ticket FLINK-34032 to track and discuss
>>> > > > this.
>>> > > >
>>> > > > On Wed, Jan 10, 2024 at 6:31 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > >
>>> > > > > Hi everyone,
>>> > > > >
>>> > > > > It seems we still don't have a consensus on the rules for boolean
>>> > > > > type options. Let me recap the alternatives we have:
>>> > > > >
>>> > > > > Option 1: Use enumeration options instead if possible. But this may
>>> > > > > cause some name collisions or confusion as we discussed, and we
>>> > > > > should unify the statement everywhere.
>>> > > > > Option 2: Use boolean options and add 'enabled' as the suffix.
>>> > > > > Option 3: Use boolean options and ONLY add 'enabled' when there are
>>> > > > > more detailed configurations under the same prefix, to prevent one
>>> > > > > name from serving as a prefix to another.
>>> > > > >
>>> > > > > I am inclined to Option 3, since it is more in line with current
>>> > > > > practice and friendlier to existing users. It also keeps
>>> > > > > configuration names as short as possible.
>>> > > > >
>>> > > > > Looking forward to your opinions! Thanks!
>>> > > > >
>>> > > > >
>>> > > > > Best,
>>> > > > > Zakelly
>>> > > > >
>>> > > > > On Wed, Jan 10, 2024 at 3:30 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > >
>>> > > > > > Hi Hangxiang,
>>> > > > > >
>>> > > > > > Thanks for your suggestions!
>>> > > > > >
>>> > > > > >> 1. Could execution.recovery also contain some other behaviors
>>> > > > > >> about recovery ? e.g. restart-strategy.
>>> > > > > >
>>> > > > > > That's a very good point. I realize that the word 'recovery' means
>>> > > > > > way too many things. So I suggest picking a more specific word
>>> > > > > > here, how about 'execution.state-recovery.*' ? Checkpointing and
>>> > > > > > state recovery are corresponding terms and won't make ambiguity.
>>> > > > > >
>>> > > > > >> 2. Could we also remove some legacy configuration value ? e.g.
>>> > > > > >> LEGACY Mode for
>>> > > > > >> execution.savepoint-restore-mode/execution.recovery.claim-mode.
>>> > > > > >
>>> > > > > > I think we could create another FLIP for the deprecation of LEGACY
>>> > > > > > mode.
>>> > > > > >
>>> > > > > >> 3. Could the local checkpoint be cleaned
>>> > > > > >> if execution.checkpointing.local-copy.enabled is true and
>>> > > > > >> execution.recovery.from-local is false ? I found it's also an
>>> > > > > >> issue if current local-recovery from enabled to disabled. Maybe
>>> > > > > >> another ticket is needed.
>>> > > > > >
>>> > > > > > IIUC, there is no clear ownership of the local copy files from the
>>> > > > > > previous job and it's better to define one. This needs more
>>> > > > > > discussion so we could create another thread for this. WDYT?
>>> > > > > >
>>> > > > > >
>>> > > > > > Best,
>>> > > > > > Zakelly
>>> > > > > >
>>> > > > > > On Tue, Jan 9, 2024 at 11:23 AM Hangxiang Yu <master...@gmail.com> wrote:
>>> > > > > >
>>> > > > > >> Hi, Zakelly.
>>> > > > > >> Thanks for driving this. Overall LGTM as we discussed offline.
>>> > > > > >>
>>> > > > > >> Some comments/suggestions just came to mind:
>>> > > > > >> 1. Could execution.recovery also contain some other behaviors
>>> > > > > >> about recovery ? e.g. restart-strategy.
>>> > > > > >> 2. Could we also remove some legacy configuration value ? e.g.
>>> > > > > >> LEGACY Mode for
>>> > > > > >> execution.savepoint-restore-mode/execution.recovery.claim-mode.
>>> > > > > >> 3. Could the local checkpoint be cleaned
>>> > > > > >> if execution.checkpointing.local-copy.enabled is true and
>>> > > > > >> execution.recovery.from-local is false ? I found it's also an
>>> > > > > >> issue if current local-recovery from enabled to disabled. Maybe
>>> > > > > >> another ticket is needed.
>>> > > > > >> 4. +1 for enabling execution.checkpointing.incremental by default
>>> > > > > >> which is basically default configuration in our production
>>> > > > > >> environment.
>>> > > > > >>
>>> > > > > >>
>>> > > > > >> On Mon, Jan 8, 2024 at 6:06 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >>
>>> > > > > >> > Hi Yun,
>>> > > > > >> >
>>> > > > > >> > Thanks for your comments!
>>> > > > > >> >
>>> > > > > >> > > 1.  We shall not describe the configuration with its
>>> > > > > >> > > implementation for 'execution.checkpointing.local-copy.*'
>>> > > > > >> > > options, for hashmap state-backend, it would write two streams
>>> > > > > >> > > and for RocksDB state-backend, it would use hard-link for
>>> > > > > >> > > backup. Thus, I think 'execution.checkpointing.local-backup.*'
>>> > > > > >> > > looks better.
>>> > > > > >> >
>>> > > > > >> > I agree that we'd better name the option from the user's
>>> > > > > >> > perspective instead of the implementation, which is why I named
>>> > > > > >> > it as a copy of the checkpoint on the local disk, regardless of
>>> > > > > >> > the way of generating it. The word 'backup' is also suitable for
>>> > > > > >> > this case, so I agree to change to
>>> > > > > >> > 'execution.checkpointing.local-backup.*' if no one objects.
>>> > > > > >> >
>>> > > > > >> > > 2.  What does the
>>> > > > > >> > > 'execution.checkpointing.data-inline-threshold' mean? It seems
>>> > > > > >> > > not so easy to understand.
>>> > > > > >> >
>>> > > > > >> > The 'execution.checkpointing.data-inline-threshold' (originally
>>> > > > > >> > 'state.storage.fs.memory-threshold') stands for the size
>>> > > > > >> > threshold below which state chunks will be stored inline with
>>> > > > > >> > the metadata, thus I call it 'data-inline-threshold'.
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > Best,
>>> > > > > >> > Zakelly
>>> > > > > >> >
>>> > > > > >> > On Mon, Jan 8, 2024 at 10:09 AM Yun Tang <myas...@live.com> wrote:
>>> > > > > >> >
>>> > > > > >> > > Hi Zakelly,
>>> > > > > >> > >
>>> > > > > >> > > Thanks for driving this topic. I have two concerns here:
>>> > > > > >> > >
>>> > > > > >> > >   1.  We shall not describe the configuration with its
>>> > > > > >> > > implementation for 'execution.checkpointing.local-copy.*'
>>> > > > > >> > > options, for hashmap state-backend, it would write two streams
>>> > > > > >> > > and for RocksDB state-backend, it would use hard-link for
>>> > > > > >> > > backup. Thus, I think 'execution.checkpointing.local-backup.*'
>>> > > > > >> > > looks better.
>>> > > > > >> > >   2.  What does the
>>> > > > > >> > > 'execution.checkpointing.data-inline-threshold' mean? It seems
>>> > > > > >> > > not so easy to understand.
>>> > > > > >> > >
>>> > > > > >> > > Best
>>> > > > > >> > > Yun Tang
>>> > > > > >> > > ________________________________
>>> > > > > >> > > From: Piotr Nowojski <pnowoj...@apache.org>
>>> > > > > >> > > Sent: Thursday, January 4, 2024 22:37
>>> > > > > >> > > To: dev@flink.apache.org <dev@flink.apache.org>
>>> > > > > >> > > Subject: Re: [DISCUSS] FLIP-406: Reorganize State & Checkpointing & Recovery Configuration
>>> > > > > >> > >
>>> > > > > >> > > Hi,
>>> > > > > >> > >
>>> > > > > >> > > Thanks for trying to clean this up! I don't have strong
>>> > > > > >> > > opinions on the topics discussed here, so generally speaking +1
>>> > > > > >> > > from my side!
>>> > > > > >> > >
>>> > > > > >> > > Best,
>>> > > > > >> > > Piotrek
>>> > > > > >> > >
>>> > > > > >> > > On Wed, Jan 3, 2024 at 04:16 Rui Fan <1996fan...@gmail.com> wrote:
>>> > > > > >> > >
>>> > > > > >> > > > Thanks for the feedback!
>>> > > > > >> > > >
>>> > > > > >> > > > Using the `execution.checkpointing.incremental.enabled`,
>>> > > > > >> > > > and enabling it by default sounds good to me.
>>> > > > > >> > > >
>>> > > > > >> > > > Best,
>>> > > > > >> > > > Rui
>>> > > > > >> > > >
>>> > > > > >> > > > On Wed, Jan 3, 2024 at 11:10 AM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > >
>>> > > > > >> > > > > Hi Rui,
>>> > > > > >> > > > >
>>> > > > > >> > > > > Thanks for your comments!
>>> > > > > >> > > > >
>>> > > > > >> > > > > IMO, given that the state backend can be pluggably loaded
>>> > > > > >> > > > > (as you can specify a state backend factory), I prefer not
>>> > > > > >> > > > > providing state-backend-specific options in the framework.
>>> > > > > >> > > > >
>>> > > > > >> > > > > Secondly, the incremental checkpoint is actually a
>>> > > > > >> > > > > file-sharing strategy across checkpoints, which means the
>>> > > > > >> > > > > state backend *could* reuse files from previous cp but not
>>> > > > > >> > > > > *must* do so. When the state backend could not reuse the
>>> > > > > >> > > > > files, it is reasonable to fall back to a full checkpoint.
>>> > > > > >> > > > >
>>> > > > > >> > > > > Thus, I suggest we make it
>>> > > > > >> > > > > `execution.checkpointing.incremental` and enable it by
>>> > > > > >> > > > > default. For those state backends not supporting this, they
>>> > > > > >> > > > > perform full checkpoints and print a warning to inform
>>> > > > > >> > > > > users. Users do not need to pay special attention to
>>> > > > > >> > > > > different options to control this across different state
>>> > > > > >> > > > > backends. This is more user-friendly in my opinion. WDYT?
>>> > > > > >> > > > >
>>> > > > > >> > > > > On Tue, Jan 2, 2024 at 10:49 AM Rui Fan <1996fan...@gmail.com> wrote:
>>> > > > > >> > > > >
>>> > > > > >> > > > > > Hi Zakelly,
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > I'm not sure whether we could add the state backend type
>>> > > > > >> > > > > > in the new key name of state.backend.incremental. It
>>> > > > > >> > > > > > means we use
>>> > > > > >> > > > > > `execution.checkpointing.rocksdb-incremental` or
>>> > > > > >> > > > > > `execution.checkpointing.rocksdb-incremental.enabled`.
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > So far, state.backend.incremental only works for rocksdb
>>> > > > > >> > > > > > state backend. And this feature or optimization is very
>>> > > > > >> > > > > > valuable and huge for large state flink jobs. I believe
>>> > > > > >> > > > > > it's enabled for most production flink jobs with large
>>> > > > > >> > > > > > rocksdb state.
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > If this option isn't generic for all state backend types,
>>> > > > > >> > > > > > I guess we can enable
>>> > > > > >> > > > > > `execution.checkpointing.rocksdb-incremental.enabled`
>>> > > > > >> > > > > > by default in Flink 2.0.
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > But if it works for all state backends, it's hard to
>>> > > > > >> > > > > > enable it by default.
>>> > > > > >> > > > > > Enabling great and valuable features or improvements is
>>> > > > > >> > > > > > useful for users, especially a lot of new flink users.
>>> > > > > >> > > > > > Out-of-the-box options are good for users.
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > WDYT?
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > Best,
>>> > > > > >> > > > > > Rui
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > On Fri, Dec 29, 2023 at 1:45 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > > > >
>>> > > > > >> > > > > > > Hi everyone,
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > Thanks all for your comments!
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > As many of you have questions about the names for
>>> > > > > >> > > > > > > boolean options, I suggest we make a naming rule for
>>> > > > > >> > > > > > > them. For now I could think of three options:
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > Option 1: Use enumeration options if possible. But this
>>> > > > > >> > > > > > > may cause some name collisions or confusion as we
>>> > > > > >> > > > > > > discussed, and we should unify the statement
>>> > > > > >> > > > > > > everywhere.
>>> > > > > >> > > > > > > Option 2: Use boolean options and add 'enabled' as the
>>> > > > > >> > > > > > > suffix.
>>> > > > > >> > > > > > > Option 3: Use boolean options and ONLY add 'enabled'
>>> > > > > >> > > > > > > when there are more detailed configurations under the
>>> > > > > >> > > > > > > same prefix, to prevent one name from serving as a
>>> > > > > >> > > > > > > prefix to another.
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > I am slightly inclined to Option 3, since it is more in
>>> > > > > >> > > > > > > line with current practice and friendlier to existing
>>> > > > > >> > > > > > > users. It also keeps configuration names as short as
>>> > > > > >> > > > > > > possible. I really want to hear your opinions.
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > @Xuannan
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > I agree with your comments 1 and 3.
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > For 2, If we decide to change the name, maybe
>>> > > > > >> > > > > > > `execution.checkpointing.parallel-cleaner` is better?
>>> > > > > >> > > > > > > And as for whether to add 'enabled' I suggest we
>>> > > > > >> > > > > > > discuss the rule above. WDYT?
>>> > > > > >> > > > > > > Thanks!
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > Best,
>>> > > > > >> > > > > > > Zakelly
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > On Fri, Dec 29, 2023 at 12:02 PM Xuannan Su <suxuanna...@gmail.com> wrote:
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > Thanks for driving this! The organization of the
>>> > > > > >> > > > > > > > configuration options in the FLIP looks much cleaner
>>> > > > > >> > > > > > > > and easier to understand. +1 to the FLIP.
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > Just some questions from me.
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > 1. I think the change to the ConfigOptions should be
>>> > > > > >> > > > > > > > put in the `Public Interface` section, instead of
>>> > > > > >> > > > > > > > `Proposed Changes`, as those configuration options
>>> > > > > >> > > > > > > > are public interfaces.
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > 2. The key `state.checkpoint.cleaner.parallel-mode`
>>> > > > > >> > > > > > > > seems confusing. It feels like it is used to choose
>>> > > > > >> > > > > > > > different modes. In fact, it is a boolean flag to
>>> > > > > >> > > > > > > > indicate whether to enable parallel clean. How about
>>> > > > > >> > > > > > > > making it
>>> > > > > >> > > > > > > > `state.checkpoint.cleaner.parallel-mode.enabled`?
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > 3. The `execution.checkpointing.write-buffer` may
>>> > > > > >> > > > > > > > better be `execution.checkpointing.write-buffer-size`
>>> > > > > >> > > > > > > > so that we know it is configuring the size of the
>>> > > > > >> > > > > > > > buffer.
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > Best,
>>> > > > > >> > > > > > > > Xuannan
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > > > On Wed, Dec 27, 2023 at 7:17 PM Yanfei Lei <fredia...@gmail.com> wrote:
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > > Considering the name occupation, how about
>>> > > > > >> > > > > > > > > > naming it as `execution.checkpointing.type`?
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > `Checkpoint Type`[1,2] is used to describe
>>> > > > > >> > > > > > > > > aligned/unaligned checkpoint, I am inclined to
>>> > > > > >> > > > > > > > > make a choice between
>>> > > > > >> > > > > > > > > `execution.checkpointing.incremental` and
>>> > > > > >> > > > > > > > > `execution.checkpointing.incremental.enabled`.
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/monitoring/checkpoint_monitoring/
>>> > > > > >> > > > > > > > > [2] https://github.com/apache/flink/blob/master/flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.html#L27
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > --
>>> > > > > >> > > > > > > > > Best,
>>> > > > > >> > > > > > > > > Yanfei
>>> > > > > >> > > > > > > > >
>>> > > > > >> > > > > > > > > On Wed, Dec 27, 2023 at 14:41, Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > Hi Lijie,
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > Thanks for the reminder! I missed this.
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > Considering the name occupation, how about
>>> > > > > >> > > > > > > > > > naming it as `execution.checkpointing.type`?
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > Actually I think the current
>>> > > > > >> > > > > > > > > > `execution.checkpointing.mode` is confusing in
>>> > > > > >> > > > > > > > > > some ways, maybe
>>> > > > > >> > > > > > > > > > `execution.checkpointing.data-consistency` is
>>> > > > > >> > > > > > > > > > better.
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > Zakelly
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > On Wed, Dec 27, 2023 at 12:59 PM Lijie Wang <wangdachui9...@gmail.com> wrote:
>>> > > > > >> > > > > > > > > >
>>> > > > > >> > > > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > >> I'm wondering if
>>> > > > > >> > > > > > > > > > > >> `execution.checkpointing.savepoint-dir`
>>> > > > > >> > > > > > > > > > > >> would be better.
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > `execution.checkpointing.dir` and
>>> > > > > >> > > > > > > > > > > `execution.checkpointing.savepoint-dir`
>>> > > > > >> > > > > > > > > > > are also fine for me.
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > >> So I think an enumeration option
>>> > > > > >> > > > > > > > > > > >> `execution.checkpointing.mode` which can be
>>> > > > > >> > > > > > > > > > > >> 'full' (default) or 'incremental' would be
>>> > > > > >> > > > > > > > > > > >> better
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > I agree with using an enumeration option. But
>>> > > > > >> > > > > > > > > > > currently there is already a configuration
>>> > > > > >> > > > > > > > > > > option called `execution.checkpointing.mode`,
>>> > > > > >> > > > > > > > > > > which is used to choose EXACTLY_ONCE or
>>> > > > > >> > > > > > > > > > > AT_LEAST_ONCE. Maybe we need to use another
>>> > > > > >> > > > > > > > > > > name or merge these two options.
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > Lijie
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > On Wed, Dec 27, 2023 at 11:43, Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > Hi everyone,
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > Thanks all for your comments!
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > @Yanfei
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > 1. For some state backends that do not
>>> > > > > >> > > > > > > > > > > > > support incremental checkpoint, how does
>>> > > > > >> > > > > > > > > > > > > the execution.checkpointing.incremental
>>> > > > > >> > > > > > > > > > > > > option take effect? Or is it better to put
>>> > > > > >> > > > > > > > > > > > > incremental under
>>> > > > > >> > > > > > > > > > > > > state.backend.xxx.incremental?
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > I'd rather not put the option for incremental
>>> > > > > >> > > > > > > > > > > > checkpoint under the 'state.backend', since
>>> > > > > >> > > > > > > > > > > > it is more about the checkpointing instead of
>>> > > > > >> > > > > > > > > > > > state accessing. Of course, the state backend
>>> > > > > >> > > > > > > > > > > > may not necessarily do incremental checkpoint
>>> > > > > >> > > > > > > > > > > > as requested. If the state backend is not
>>> > > > > >> > > > > > > > > > > > capable of taking incremental cp, it is
>>> > > > > >> > > > > > > > > > > > better to fall back to the full cp.
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > 2. I'm a little worried that putting all
>>> > > > > >> > > > > > > > > > > > > configurations into
>>> > > > > >> > > > > > > > > > > > > `ExecutionCheckpointingOptions` will
>>> > > > > >> > > > > > > > > > > > > introduce some dependency problems. Some
>>> > > > > >> > > > > > > > > > > > > options would be used by flink-runtime
>>> > > > > >> > > > > > > > > > > > > module, but flink-runtime should not depend
>>> > > > > >> > > > > > > > > > > > > on flink-streaming-java. e.g.
>>> > > > > >> > > > > > > > > > > > > FLINK-28286[1].
>>> > > > > >> > > > > > > > > > > > > So, I prefer to move configurations to
>>> > > > > >> > > > > > > > > > > > > `CheckpointingOptions`, WDYT?
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > Yes, that's a very good point.  Moving to
>>> > > > > >> > > > > > > > > > > > `CheckpointingOptions`(flink-core) makes
>>> > > > > >> > > > > > > > > > > > sense.
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > @Lijie
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > How about
>>> > > > > >> > > > > > > > > > > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
>>> > > > > >> > > > > > > > > > > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > Actually, I think the
>>> > > > > >> > > > > > > > > > > > `checkpointing.checkpoint` may cause some
>>> > > > > >> > > > > > > > > > > > confusion. But I'm ok if others agree.
>>> > > > > >> > > > > > > > > > > > I'm wondering if
>>> > > > > >> > > > > > > > > > > > `execution.checkpointing.savepoint-dir` would
>>> > > > > >> > > > > > > > > > > > be better. WDYT?
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > 2. We changed the
>>> > > > > >> > > > > > > > > > > > > 'execution.checkpointing.local-copy' to
>>> > > > > >> > > > > > > > > > > > > 'execution.checkpointing.local-copy.enabled'.
>>> > > > > >> > > > > > > > > > > > > Should we also add "enabled" suffix for
>>> > > > > >> > > > > > > > > > > > > other boolean type configuration options ?
>>> > > > > >> > > > > > > > > > > > > For example,
>>> > > > > >> > > > > > > > > > > > > execution.checkpointing.incremental ->
>>> > > > > >> > > > > > > > > > > > > execution.checkpointing.incremental.enabled
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > Actually, the incremental cp is something
>>> > > > > >> > > > > > > > > > > > like choosing a mode for doing checkpoint
>>> > > > > >> > > > > > > > > > > > instead of enabling a function. So I think an
>>> > > > > >> > > > > > > > > > > > enumeration option
>>> > > > > >> > > > > > > > > > > > `execution.checkpointing.mode` which can be
>>> > > > > >> > > > > > > > > > > > 'full' (default) or 'incremental' would be
>>> > > > > >> > > > > > > > > > > > better, WDYT?
>>> > > > > >> > > > > > > > > > > > And @Rui Fan @Yanfei What do you think about
>>> > > > > >> > > > > > > > > > > > this?
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > On Tue, Dec 26, 2023 at 5:15 PM Lijie
>>> Wang <
>>> > > > > >> > > > > > > > wangdachui9...@gmail.com>
>>> > > > > >> > > > > > > > > > > > wrote:
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > Thanks for driving the discussion.
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > 1.
>>> > > > > >> > > > > > > > > > > > > >> But I'm not so sure since there is only one savepoint-related option.
>>> > > > > >> > > > > > > > > > > > > Maybe someone else could share some thoughts here.
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > How about
>>> > > > > >> > > > > > > > > > > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
>>> > > > > >> > > > > > > > > > > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > 2. We changed 'execution.checkpointing.local-copy' to
>>> > > > > >> > > > > > > > > > > > > 'execution.checkpointing.local-copy.enabled'. Should we also add an
>>> > > > > >> > > > > > > > > > > > > "enabled" suffix to other boolean-type configuration options? For example,
>>> > > > > >> > > > > > > > > > > > > execution.checkpointing.incremental ->
>>> > > > > >> > > > > > > > > > > > > execution.checkpointing.incremental.enabled
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > In this way, the naming style of configuration options is unified, and it
>>> > > > > >> > > > > > > > > > > > > can avoid potential similar problems (for example, we may need to add more
>>> > > > > >> > > > > > > > > > > > > options for incremental checkpoint in the future).
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > > > Lijie
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > On Tue, Dec 26, 2023 at 12:05, Yanfei Lei <fredia...@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > Thank you for creating the FLIP and starting the discussion.
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > The current arrangement of these options is indeed somewhat haphazard,
>>> > > > > >> > > > > > > > > > > > > > and the new arrangement looks much better. I have some questions about
>>> > > > > >> > > > > > > > > > > > > > the arrangement of some new configuration options:
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > 1. For some state backends that do not support incremental checkpoint,
>>> > > > > >> > > > > > > > > > > > > > how does the execution.checkpointing.incremental option take effect? Or
>>> > > > > >> > > > > > > > > > > > > > is it better to put incremental under state.backend.xxx.incremental?
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > 2. I'm a little worried that putting all configurations into
>>> > > > > >> > > > > > > > > > > > > > `ExecutionCheckpointingOptions` will introduce some dependency
>>> > > > > >> > > > > > > > > > > > > > problems. Some options would be used by the flink-runtime module, but
>>> > > > > >> > > > > > > > > > > > > > flink-runtime should not depend on flink-streaming-java, e.g.
>>> > > > > >> > > > > > > > > > > > > > FLINK-28286[1].
>>> > > > > >> > > > > > > > > > > > > > So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-28286
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > --
>>> > > > > >> > > > > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > > > > Yanfei
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > On Mon, Dec 25, 2023 at 21:14, Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Hi Rui Fan and Junrui,
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Thanks for the reminder! I agree to change the
>>> > > > > >> > > > > > > > > > > > > > > 'execution.checkpointing.local-copy' to
>>> > > > > >> > > > > > > > > > > > > > > 'execution.checkpointing.local-copy.enabled'.
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > And for other suggestions Rui
>>> > proposed:
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > 1. How about execution.checkpointing.storage.type instead
>>> > > > > >> > > > > > > > > > > > > > > > of execution.checkpointing.storage?
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Ah, I missed something here. Actually I suggest we could merge the current
>>> > > > > >> > > > > > > > > > > > > > > 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI
>>> > > > > >> > > > > > > > > > > > > > > configuration named 'execution.checkpointing.dir'. WDYT?
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > 3. execution.checkpointing.savepoint.dir is a little weird.
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Yes, I think it is better to put 'savepoint' and 'checkpoint' at the same
>>> > > > > >> > > > > > > > > > > > > > > level. But I'm not so sure since there is only one savepoint-related
>>> > > > > >> > > > > > > > > > > > > > > option. Maybe someone else could share some thoughts here.
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > 4. How about execution.recovery.claim-mode instead of
>>> > > > > >> > > > > > > > > > > > > > > > execution.recovery.mode?
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > >  Agreed. That's more accurate.
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Many thanks for your suggestions!
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > > > > > Zakelly
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <jrlee....@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > Thanks for driving this. I agree that the proposed restructuring of the
>>> > > > > >> > > > > > > > > > > > > > > > configuration options is largely positive. It will make understanding and
>>> > > > > >> > > > > > > > > > > > > > > > working with Flink configurations more intuitive.
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > Most of the proposed changes look great. Just a heads-up, as Rui Fan
>>> > > > > >> > > > > > > > > > > > > > > > mentioned, Flink currently requires that no configOption's key be the
>>> > > > > >> > > > > > > > > > > > > > > > prefix of another to avoid issues when we eventually adopt a standard YAML
>>> > > > > >> > > > > > > > > > > > > > > > parser, as detailed in FLINK-29372 (
>>> > > > > >> > > > > > > > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-29372 ). Therefore, it's better
>>> > > > > >> > > > > > > > > > > > > > > > to change the key 'execution.checkpointing.local-copy' because it serves as
>>> > > > > >> > > > > > > > > > > > > > > > a prefix to the key 'execution.checkpointing.local-copy.dir'.
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > Best regards,
>>> > > > > >> > > > > > > > > > > > > > > > Junrui
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > On Mon, Dec 25, 2023 at 19:11, Rui Fan <1996fan...@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > Hi Zakelly,
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > Thank you for driving this proposal!
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > Overall good for me. I have some questions about these names.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > 1. How about execution.checkpointing.storage.type instead of
>>> > > > > >> > > > > > > > > > > > > > > > > execution.checkpointing.storage?
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > It's similar to state.backend.type.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > 2. How about execution.checkpointing.local-copy.enabled instead of
>>> > > > > >> > > > > > > > > > > > > > > > > execution.checkpointing.local-copy?
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > You added a new option: execution.checkpointing.local-copy.dir.
>>> > > > > >> > > > > > > > > > > > > > > > > IIUC, one option name shouldn't be the prefix of other options.
>>> > > > > >> > > > > > > > > > > > > > > > > If you add a new option execution.checkpointing.local-copy,
>>> > > > > >> > > > > > > > > > > > > > > > > flink CI will fail directly.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > 3. execution.checkpointing.savepoint.dir is a little weird.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > For the old options, state.savepoints.dir and state.checkpoints.dir,
>>> > > > > >> > > > > > > > > > > > > > > > > savepoint and checkpoint are at the same level. It means
>>> > > > > >> > > > > > > > > > > > > > > > > it's a checkpoint or savepoint.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > The new option execution.checkpointing.dir is fine for me.
>>> > > > > >> > > > > > > > > > > > > > > > > However, execution.checkpointing.savepoint.dir is a little weird.
>>> > > > > >> > > > > > > > > > > > > > > > > I don't know which name is better now. Let us think about it more.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > 4. How about execution.recovery.claim-mode instead of
>>> > > > > >> > > > > > > > > > > > > > > > > execution.recovery.mode?
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > The meaning of mode is too broad. The claim-mode may
>>> > > > > >> > > > > > > > > > > > > > > > > be more accurate for users.
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > WDYT?
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > > > > > > > Rui
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <zakelly....@gmail.com> wrote:
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > Hi devs,
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > I'd like to start a discussion on FLIP-406: Reorganize State &
>>> > > > > >> > > > > > > > > > > > > > > > > > Checkpointing & Recovery Configuration[1].
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > Currently, the configuration options pertaining to checkpointing, recovery,
>>> > > > > >> > > > > > > > > > > > > > > > > > and state management are primarily grouped under the following prefixes:
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > >    - state.backend.* : configurations related to state accessing and
>>> > > > > >> > > > > > > > > > > > > > > > > >    checkpointing, as well as specific options for individual state backends
>>> > > > > >> > > > > > > > > > > > > > > > > >    - execution.checkpointing.* : configurations associated with checkpoint
>>> > > > > >> > > > > > > > > > > > > > > > > >    execution and recovery
>>> > > > > >> > > > > > > > > > > > > > > > > >    - execution.savepoint.*: configurations for recovery from savepoint
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > In addition, there are several individual options such as
>>> > > > > >> > > > > > > > > > > > > > > > > > '*state.checkpoint-storage*' and '*state.checkpoints.dir*' that fall outside
>>> > > > > >> > > > > > > > > > > > > > > > > > of these prefixes. The current arrangement of these options, which span
>>> > > > > >> > > > > > > > > > > > > > > > > > multiple modules, is somewhat haphazard and lacks a systematic structure.
>>> > > > > >> > > > > > > > > > > > > > > > > > For example, the options under the '*CheckpointingOptions*' and
>>> > > > > >> > > > > > > > > > > > > > > > > > '*ExecutionCheckpointingOptions*' are related and have no clear boundaries
>>> > > > > >> > > > > > > > > > > > > > > > > > from the user's perspective, but there is no unified prefix for them. With
>>> > > > > >> > > > > > > > > > > > > > > > > > the upcoming release of Flink 2.0, we have an excellent opportunity to
>>> > > > > >> > > > > > > > > > > > > > > > > > overhaul and restructure the configurations related to checkpointing,
>>> > > > > >> > > > > > > > > > > > > > > > > > recovery, and state management. This FLIP proposes to reorganize these
>>> > > > > >> > > > > > > > > > > > > > > > > > settings, making it more coherent by module, which would significantly
>>> > > > > >> > > > > > > > > > > > > > > > > > lower the barriers for understanding and reduce the development costs
>>> > > > > >> > > > > > > > > > > > > > > > > > moving forward.
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > Looking forward to hearing from you!
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > [1]
>>> > > > > >> > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > > > Best,
>>> > > > > >> > > > > > > > > > > > > > > > > > Zakelly
>>> > > > > >> > > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > > >
>>> > > > > >> > > > > > > > > > >
>>> > > > > >> > > > > > > >
>>> > > > > >> > > > > > >
>>> > > > > >> > > > > >
>>> > > > > >> > > > >
>>> > > > > >> > > >
>>> > > > > >> > >
>>> > > > > >> >
>>> > > > > >>
>>> > > > > >>
>>> > > > > >> --
>>> > > > > >> Best,
>>> > > > > >> Hangxiang.
>>> > > > > >>
>>> > > > > >
>>> > > > >
>>> > > >
>>> > > >
>>> > > > --
>>> > > > Best,
>>> > > > Hangxiang.
>>> > >
>>> > >
>>> >
>>>
>>
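
For readers skimming the quoted thread: the rule raised by Rui Fan and
Junrui (no option key may be the prefix of another, see FLINK-29372) can be
illustrated with a small sketch. This is not Flink's actual CI check; the
function name find_prefix_conflicts and the two key lists are hypothetical
and only mirror the keys discussed above.

```python
def find_prefix_conflicts(keys):
    """Return (shorter, longer) pairs where `shorter` is a dotted prefix of `longer`.

    Hypothetical helper for illustration only, not part of Flink.
    """
    conflicts = []
    ordered = sorted(keys)  # a prefix always sorts before the keys it prefixes
    for i, shorter in enumerate(ordered):
        for longer in ordered[i + 1:]:
            # Compare on a "." boundary so that "a.b" conflicts with "a.b.c"
            # but not with "a.bc".
            if longer.startswith(shorter + "."):
                conflicts.append((shorter, longer))
    return conflicts


# Keys as originally proposed: the boolean key is a prefix of the ".dir" key.
proposed = [
    "execution.checkpointing.local-copy",
    "execution.checkpointing.local-copy.dir",
]

# Keys after the rename agreed on in the thread.
renamed = [
    "execution.checkpointing.local-copy.enabled",
    "execution.checkpointing.local-copy.dir",
]

print(find_prefix_conflicts(proposed))
print(find_prefix_conflicts(renamed))
```

With the proposed keys the sketch reports one conflicting pair; with the
".enabled" rename it reports none, which is why the rename avoids the
nested-YAML ambiguity mentioned in FLINK-29372.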
