Hi everyone,

Thanks all for your comments!

@Yanfei

> 1. For some state backends that do not support incremental checkpoint,
> how does the execution.checkpointing.incrementaloption take effect? Or
> is it better to put incremental under state.backend.xxx.incremental?
>
I'd rather not put the option for incremental checkpoint under the
'state.backend', since it is more about the checkpointing instead of state
accessing. Of course, the state backend may not necessarily do incremental
checkpoint as requested. If the state backend is not capable of taking
incremental cp, it is better to fallback to the full cp.

2. I'm a little worried that putting all configurations into
> `ExecutionCheckpointingOptions` will introduce some dependency
> problems. Some options would be used by flink-runtime module, but
> flink-runtime should not depend on flink-streaming-java. e.g.
> FLINK-28286[1].
> So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
>

Yes, that's a very good point.  Moving to
`CheckpointingOptions`(flink-core) makes sense.

@Lijie

How about
> state.savepoints.dir -> execution.checkpointing.savepoint.dir
> state.checkpoints.dir -> execution.checkpointing.checkpoint.dir


Actually, I think the `checkpointing.checkpoint` may cause some confusion.
But I'm ok if others agree.
I'm wondering if `execution.checkpointing.savepoint-dir` would be better.
WDYT?

2. We changed the execution.checkpointing.local-copy' to
> 'execution.checkpointing.local-copy.enabled'. Should we also add "enabled"
> suffix for other boolean type configuration options ? For example,
> execution.checkpointing.incremental ->
> execution.checkpointing.incremental.enabled
>

Actually, the incremental cp is something like choosing a mode for doing
checkpoint instead of enabling a function. So I think an enumeration option
`execution.checkpointing.mode` which can be 'full' (default) or
'incremental' would be better, WDYT?
And @Rui Fan @Yanfei What do you think about this?


On Tue, Dec 26, 2023 at 5:15 PM Lijie Wang <wangdachui9...@gmail.com> wrote:

> Hi Zakelly,
>
> Thanks for driving the discussion.
>
> 1.
> >> But I'm not so sure since there is only one savepoint-related option.
> Maybe someone else could share some thoughts here.
>
> How about
> state.savepoints.dir -> execution.checkpointing.savepoint.dir
> state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
>
> 2. We changed the execution.checkpointing.local-copy' to
> 'execution.checkpointing.local-copy.enabled'. Should we also add "enabled"
> suffix for other boolean type configuration options ? For example,
> execution.checkpointing.incremental ->
> execution.checkpointing.incremental.enabled
>
> In this way, the naming style of configuration options is unified, and it
> can avoid potential similar problems (for example, we may need to add more
> options for incremental checkpoint in the future).
>
> Best,
> Lijie
>
> Yanfei Lei <fredia...@gmail.com> 于2023年12月26日周二 12:05写道:
>
> > Hi Zakelly,
> >
> > Thank you for creating the FLIP and starting the discussion.
> >
> > The current arrangement of these options is indeed somewhat haphazard,
> > and the new arrangement looks much better. I have some questions about
> > the arrangement of some new configuration options:
> >
> > 1. For some state backends that do not support incremental checkpoint,
> > how does the execution.checkpointing.incrementaloption take effect? Or
> > is it better to put incremental under state.backend.xxx.incremental?
> >
> > 2. I'm a little worried that putting all configurations into
> > `ExecutionCheckpointingOptions` will introduce some dependency
> > problems. Some options would be used by flink-runtime module, but
> > flink-runtime should not depend on flink-streaming-java. e.g.
> > FLINK-28286[1].
> > So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-28286
> >
> > --
> > Best,
> > Yanfei
> >
> > Zakelly Lan <zakelly....@gmail.com> 于2023年12月25日周一 21:14写道:
> > >
> > > Hi Rui Fan and Junrui,
> > >
> > > Thanks for the reminder! I agree to change the
> > > 'execution.checkpointing.local-copy' to
> > > 'execution.checkpointing.local-copy.enabled'.
> > >
> > > And for other suggestions Rui proposed:
> > >
> > > 1. How about execution.checkpointing.storage.type instead
> > > > of execution.checkpointing.storage?
> > >
> > >
> > > Ah, I missed something here. Actually I suggest we could merge the
> > current
> > > 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI
> > > configuration named 'execution.checkpointing.dir'. WDYT?
> > >
> > > 3. execution.checkpointing.savepoint.dir is a little weird.
> > > >
> > >
> > > Yes, I think it is better to make 'savepoint' and 'checkpoint' the same
> > > level. But I'm not so sure since there is only one savepoint-related
> > > option. Maybe someone else could share some thoughts here.
> > >
> > > 4. How about execution.recovery.claim-mode instead of
> > > > execution.recovery.mode?
> > > >
> > >
> > >  Agreed. That's more accurate.
> > >
> > >
> > > Many thanks for your suggestions!
> > >
> > > Best,
> > > Zakelly
> > >
> > > On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <jrlee....@gmail.com>
> wrote:
> > >
> > > > Hi Zakelly,
> > > >
> > > > Thanks for driving this. I agree that the proposed restructuring of
> the
> > > > configuration options is largely positive. It will make understanding
> > and
> > > > working with Flink configurations more intuitive.
> > > >
> > > > Most of the proposed changes look great. Just a heads-up, as Rui Fan
> > > > mentioned, Flink currently requires that no configOption's key be the
> > > > prefix of another to avoid issues when we eventually adopt a standard
> > YAML
> > > > parser, as detailed in FLINK-29372 (
> > > > https://issues.apache.org/jira/browse/FLINK-29372). Therefore, it's
> > better
> > > > to change the key 'execution.checkpointing.local-copy' because it
> > serves as
> > > > a prefix to the key 'execution.checkpointing.local-copy.dir'.
> > > >
> > > > Best regards,
> > > > Junrui
> > > >
> > > > Rui Fan <1996fan...@gmail.com> 于2023年12月25日周一 19:11写道:
> > > >
> > > > > Hi Zakelly,
> > > > >
> > > > > Thank you for driving this proposal!
> > > > >
> > > > > Overall good for me. I have some questions about these names.
> > > > >
> > > > > 1. How about execution.checkpointing.storage.type instead of
> > > > > execution.checkpointing.storage?
> > > > >
> > > > > It's similar to state.backend.type.
> > > > >
> > > > > 2. How about execution.checkpointing.local-copy.enabled instead of
> > > > > execution.checkpointing.local-copy?
> > > > >
> > > > > You added a new option: execution.checkpointing.local-copy.dir.
> > > > > IIUC, one option name shouldn't be the prefix of other options.
> > > > > If you add a new option execution.checkpointing.local-copy,
> > > > > flink CI will fail directly.
> > > > >
> > > > > 3. execution.checkpointing.savepoint.dir is a little weird.
> > > > >
> > > > > For old options: state.savepoints.dir and state.checkpoints.dir,
> > > > > the savepoint and checkpoint are the same level. It means
> > > > > it's a checkpoint or savepoint.
> > > > >
> > > > > The new option execution.checkpointing.dir is fine for me.
> > > > > However, execution.checkpointing.savepoint.dir is a little weird.
> > > > > I don't know which name is better now. Let us think about it more.
> > > > >
> > > > > 4. How about execution.recovery.claim-mode instead of
> > > > > execution.recovery.mode?
> > > > >
> > > > > The meaning of mode is too broad. The claim-mode may
> > > > > be more accurate for users.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <zakelly....@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > I'd like to start a discussion on FLIP-406: Reorganize State &
> > > > > > Checkpointing & Recovery Configuration[1].
> > > > > >
> > > > > > Currently, the configuration options pertaining to checkpointing,
> > > > > recovery,
> > > > > > and state management are primarily grouped under the following
> > > > prefixes:
> > > > > >
> > > > > >    - state.backend.* : configurations related to state accessing
> > and
> > > > > >    checkpointing, as well as specific options for individual
> state
> > > > > backends
> > > > > >    - execution.checkpointing.* : configurations associated with
> > > > > checkpoint
> > > > > >    execution and recovery
> > > > > >    - execution.savepoint.*: configurations for recovery from
> > savepoint
> > > > > >
> > > > > > In addition, there are several individual options such as '
> > > > > > *state.checkpoint-storage*' and '*state.checkpoints.dir*' that
> fall
> > > > > outside
> > > > > > of these prefixes. The current arrangement of these options,
> which
> > span
> > > > > > multiple modules, is somewhat haphazard and lacks a systematic
> > > > structure.
> > > > > > For example, the options under the '*CheckpointingOptions*' and '
> > > > > > *ExecutionCheckpointingOptions*' are related and have no clear
> > > > boundaries
> > > > > > from the user's perspective, but there is no unified prefix for
> > them.
> > > > > With
> > > > > > the upcoming release of Flink 2.0, we have an excellent
> > opportunity to
> > > > > > overhaul and restructure the configurations related to
> > checkpointing,
> > > > > > recovery, and state management. This FLIP proposes to reorganize
> > these
> > > > > > settings, making it more coherent by module, which would
> > significantly
> > > > > > lower the barriers for understanding and reduce the development
> > costs
> > > > > > moving forward.
> > > > > >
> > > > > > Looking forward to hearing from you!
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560
> > > > > >
> > > > > > Best,
> > > > > > Zakelly
> > > > > >
> > > > >
> > > >
> >
>

Reply via email to