Hi Zakelly,

> Considering the name occupation, how about naming it as 
> `execution.checkpointing.type`?

`Checkpoint Type`[1,2] is used to describe aligned/unaligned
checkpoint, I am inclined to make a choice between
`execution.checkpointing.incremental` and
`execution.checkpointing.incremental.enabled`.


[1] 
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/monitoring/checkpoint_monitoring/
[2] 
https://github.com/apache/flink/blob/master/flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.html#L27

--
Best,
Yanfei

Zakelly Lan <zakelly....@gmail.com> 于2023年12月27日周三 14:41写道:
>
> Hi Lijie,
>
> Thanks for the reminder! I missed this.
>
> Considering the name occupation, how about naming it as
> `execution.checkpointing.type`?
>
> Actually I think the current `execution.checkpointing.mode` is confusing in
> some ways, maybe `execution.checkpointing.data-consistency` is better.
>
>
> Best,
> Zakelly
>
>
> On Wed, Dec 27, 2023 at 12:59 PM Lijie Wang <wangdachui9...@gmail.com>
> wrote:
>
> > Hi Zakelly,
> >
> > >> I'm wondering if `execution.checkpointing.savepoint-dir` would be
> > better.
> >
> > `execution.checkpointing.dir` and `execution.checkpointing.savepoint-dir`
> > are also fine for me.
> >
> > >> So I think an enumeration option `execution.checkpointing.mode` which
> > can be 'full' (default) or 'incremental' would be better
> >
> > I agree with using an enumeration option. But currently there is already a
> > configuration option called `execution.checkpointing.mode`, which is used
> > to choose EXACTLY_ONCE or AT_LEAST_ONCE. Maybe we need to use another name
> > or merge these two options.
> >
> > Best,
> > Lijie
> >
> > Zakelly Lan <zakelly....@gmail.com> 于2023年12月27日周三 11:43写道:
> >
> > > Hi everyone,
> > >
> > > Thanks all for your comments!
> > >
> > > @Yanfei
> > >
> > > > 1. For some state backends that do not support incremental checkpoint,
> > > > how does the execution.checkpointing.incrementaloption take effect? Or
> > > > is it better to put incremental under state.backend.xxx.incremental?
> > > >
> > > I'd rather not put the option for incremental checkpoint under the
> > > 'state.backend', since it is more about the checkpointing instead of
> > state
> > > accessing. Of course, the state backend may not necessarily do
> > incremental
> > > checkpoint as requested. If the state backend is not capable of taking
> > > incremental cp, it is better to fallback to the full cp.
> > >
> > > 2. I'm a little worried that putting all configurations into
> > > > `ExecutionCheckpointingOptions` will introduce some dependency
> > > > problems. Some options would be used by flink-runtime module, but
> > > > flink-runtime should not depend on flink-streaming-java. e.g.
> > > > FLINK-28286[1].
> > > > So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
> > > >
> > >
> > > Yes, that's a very good point.  Moving to
> > > `CheckpointingOptions`(flink-core) makes sense.
> > >
> > > @Lijie
> > >
> > > How about
> > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
> > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
> > >
> > >
> > > Actually, I think the `checkpointing.checkpoint` may cause some
> > confusion.
> > > But I'm ok if others agree.
> > > I'm wondering if `execution.checkpointing.savepoint-dir` would be better.
> > > WDYT?
> > >
> > > 2. We changed the execution.checkpointing.local-copy' to
> > > > 'execution.checkpointing.local-copy.enabled'. Should we also add
> > > "enabled"
> > > > suffix for other boolean type configuration options ? For example,
> > > > execution.checkpointing.incremental ->
> > > > execution.checkpointing.incremental.enabled
> > > >
> > >
> > > Actually, the incremental cp is something like choosing a mode for doing
> > > checkpoint instead of enabling a function. So I think an enumeration
> > option
> > > `execution.checkpointing.mode` which can be 'full' (default) or
> > > 'incremental' would be better, WDYT?
> > > And @Rui Fan @Yanfei What do you think about this?
> > >
> > >
> > > On Tue, Dec 26, 2023 at 5:15 PM Lijie Wang <wangdachui9...@gmail.com>
> > > wrote:
> > >
> > > > Hi Zakelly,
> > > >
> > > > Thanks for driving the discussion.
> > > >
> > > > 1.
> > > > >> But I'm not so sure since there is only one savepoint-related
> > option.
> > > > Maybe someone else could share some thoughts here.
> > > >
> > > > How about
> > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
> > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
> > > >
> > > > 2. We changed the execution.checkpointing.local-copy' to
> > > > 'execution.checkpointing.local-copy.enabled'. Should we also add
> > > "enabled"
> > > > suffix for other boolean type configuration options ? For example,
> > > > execution.checkpointing.incremental ->
> > > > execution.checkpointing.incremental.enabled
> > > >
> > > > In this way, the naming style of configuration options is unified, and
> > it
> > > > can avoid potential similar problems (for example, we may need to add
> > > more
> > > > options for incremental checkpoint in the future).
> > > >
> > > > Best,
> > > > Lijie
> > > >
> > > > Yanfei Lei <fredia...@gmail.com> 于2023年12月26日周二 12:05写道:
> > > >
> > > > > Hi Zakelly,
> > > > >
> > > > > Thank you for creating the FLIP and starting the discussion.
> > > > >
> > > > > The current arrangement of these options is indeed somewhat
> > haphazard,
> > > > > and the new arrangement looks much better. I have some questions
> > about
> > > > > the arrangement of some new configuration options:
> > > > >
> > > > > 1. For some state backends that do not support incremental
> > checkpoint,
> > > > > how does the execution.checkpointing.incrementaloption take effect?
> > Or
> > > > > is it better to put incremental under state.backend.xxx.incremental?
> > > > >
> > > > > 2. I'm a little worried that putting all configurations into
> > > > > `ExecutionCheckpointingOptions` will introduce some dependency
> > > > > problems. Some options would be used by flink-runtime module, but
> > > > > flink-runtime should not depend on flink-streaming-java. e.g.
> > > > > FLINK-28286[1].
> > > > > So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-28286
> > > > >
> > > > > --
> > > > > Best,
> > > > > Yanfei
> > > > >
> > > > > Zakelly Lan <zakelly....@gmail.com> 于2023年12月25日周一 21:14写道:
> > > > > >
> > > > > > Hi Rui Fan and Junrui,
> > > > > >
> > > > > > Thanks for the reminder! I agree to change the
> > > > > > 'execution.checkpointing.local-copy' to
> > > > > > 'execution.checkpointing.local-copy.enabled'.
> > > > > >
> > > > > > And for other suggestions Rui proposed:
> > > > > >
> > > > > > 1. How about execution.checkpointing.storage.type instead
> > > > > > > of execution.checkpointing.storage?
> > > > > >
> > > > > >
> > > > > > Ah, I missed something here. Actually I suggest we could merge the
> > > > > current
> > > > > > 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI
> > > > > > configuration named 'execution.checkpointing.dir'. WDYT?
> > > > > >
> > > > > > 3. execution.checkpointing.savepoint.dir is a little weird.
> > > > > > >
> > > > > >
> > > > > > Yes, I think it is better to make 'savepoint' and 'checkpoint' the
> > > same
> > > > > > level. But I'm not so sure since there is only one
> > savepoint-related
> > > > > > option. Maybe someone else could share some thoughts here.
> > > > > >
> > > > > > 4. How about execution.recovery.claim-mode instead of
> > > > > > > execution.recovery.mode?
> > > > > > >
> > > > > >
> > > > > >  Agreed. That's more accurate.
> > > > > >
> > > > > >
> > > > > > Many thanks for your suggestions!
> > > > > >
> > > > > > Best,
> > > > > > Zakelly
> > > > > >
> > > > > > On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <jrlee....@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Hi Zakelly,
> > > > > > >
> > > > > > > Thanks for driving this. I agree that the proposed restructuring
> > of
> > > > the
> > > > > > > configuration options is largely positive. It will make
> > > understanding
> > > > > and
> > > > > > > working with Flink configurations more intuitive.
> > > > > > >
> > > > > > > Most of the proposed changes look great. Just a heads-up, as Rui
> > > Fan
> > > > > > > mentioned, Flink currently requires that no configOption's key be
> > > the
> > > > > > > prefix of another to avoid issues when we eventually adopt a
> > > standard
> > > > > YAML
> > > > > > > parser, as detailed in FLINK-29372 (
> > > > > > > https://issues.apache.org/jira/browse/FLINK-29372). Therefore,
> > > it's
> > > > > better
> > > > > > > to change the key 'execution.checkpointing.local-copy' because it
> > > > > serves as
> > > > > > > a prefix to the key 'execution.checkpointing.local-copy.dir'.
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Junrui
> > > > > > >
> > > > > > > Rui Fan <1996fan...@gmail.com> 于2023年12月25日周一 19:11写道:
> > > > > > >
> > > > > > > > Hi Zakelly,
> > > > > > > >
> > > > > > > > Thank you for driving this proposal!
> > > > > > > >
> > > > > > > > Overall good for me. I have some questions about these names.
> > > > > > > >
> > > > > > > > 1. How about execution.checkpointing.storage.type instead of
> > > > > > > > execution.checkpointing.storage?
> > > > > > > >
> > > > > > > > It's similar to state.backend.type.
> > > > > > > >
> > > > > > > > 2. How about execution.checkpointing.local-copy.enabled instead
> > > of
> > > > > > > > execution.checkpointing.local-copy?
> > > > > > > >
> > > > > > > > You added a new option: execution.checkpointing.local-copy.dir.
> > > > > > > > IIUC, one option name shouldn't be the prefix of other options.
> > > > > > > > If you add a new option execution.checkpointing.local-copy,
> > > > > > > > flink CI will fail directly.
> > > > > > > >
> > > > > > > > 3. execution.checkpointing.savepoint.dir is a little weird.
> > > > > > > >
> > > > > > > > For old options: state.savepoints.dir and
> > state.checkpoints.dir,
> > > > > > > > the savepoint and checkpoint are the same level. It means
> > > > > > > > it's a checkpoint or savepoint.
> > > > > > > >
> > > > > > > > The new option execution.checkpointing.dir is fine for me.
> > > > > > > > However, execution.checkpointing.savepoint.dir is a little
> > weird.
> > > > > > > > I don't know which name is better now. Let us think about it
> > > more.
> > > > > > > >
> > > > > > > > 4. How about execution.recovery.claim-mode instead of
> > > > > > > > execution.recovery.mode?
> > > > > > > >
> > > > > > > > The meaning of mode is too broad. The claim-mode may
> > > > > > > > be more accurate for users.
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Rui
> > > > > > > >
> > > > > > > > On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <
> > > zakelly....@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi devs,
> > > > > > > > >
> > > > > > > > > I'd like to start a discussion on FLIP-406: Reorganize State
> > &
> > > > > > > > > Checkpointing & Recovery Configuration[1].
> > > > > > > > >
> > > > > > > > > Currently, the configuration options pertaining to
> > > checkpointing,
> > > > > > > > recovery,
> > > > > > > > > and state management are primarily grouped under the
> > following
> > > > > > > prefixes:
> > > > > > > > >
> > > > > > > > >    - state.backend.* : configurations related to state
> > > accessing
> > > > > and
> > > > > > > > >    checkpointing, as well as specific options for individual
> > > > state
> > > > > > > > backends
> > > > > > > > >    - execution.checkpointing.* : configurations associated
> > with
> > > > > > > > checkpoint
> > > > > > > > >    execution and recovery
> > > > > > > > >    - execution.savepoint.*: configurations for recovery from
> > > > > savepoint
> > > > > > > > >
> > > > > > > > > In addition, there are several individual options such as '
> > > > > > > > > *state.checkpoint-storage*' and '*state.checkpoints.dir*'
> > that
> > > > fall
> > > > > > > > outside
> > > > > > > > > of these prefixes. The current arrangement of these options,
> > > > which
> > > > > span
> > > > > > > > > multiple modules, is somewhat haphazard and lacks a
> > systematic
> > > > > > > structure.
> > > > > > > > > For example, the options under the '*CheckpointingOptions*'
> > > and '
> > > > > > > > > *ExecutionCheckpointingOptions*' are related and have no
> > clear
> > > > > > > boundaries
> > > > > > > > > from the user's perspective, but there is no unified prefix
> > for
> > > > > them.
> > > > > > > > With
> > > > > > > > > the upcoming release of Flink 2.0, we have an excellent
> > > > > opportunity to
> > > > > > > > > overhaul and restructure the configurations related to
> > > > > checkpointing,
> > > > > > > > > recovery, and state management. This FLIP proposes to
> > > reorganize
> > > > > these
> > > > > > > > > settings, making it more coherent by module, which would
> > > > > significantly
> > > > > > > > > lower the barriers for understanding and reduce the
> > development
> > > > > costs
> > > > > > > > > moving forward.
> > > > > > > > >
> > > > > > > > > Looking forward to hearing from you!
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Zakelly
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >

Reply via email to