Hi Zakelly, Thank you for creating the FLIP and starting the discussion.
The current arrangement of these options is indeed somewhat haphazard, and the new arrangement looks much better. I have some questions about the arrangement of some new configuration options: 1. For some state backends that do not support incremental checkpoint, how does the execution.checkpointing.incrementaloption take effect? Or is it better to put incremental under state.backend.xxx.incremental? 2. I'm a little worried that putting all configurations into `ExecutionCheckpointingOptions` will introduce some dependency problems. Some options would be used by flink-runtime module, but flink-runtime should not depend on flink-streaming-java. e.g. FLINK-28286[1]. So, I prefer to move configurations to `CheckpointingOptions`, WDYT? [1] https://issues.apache.org/jira/browse/FLINK-28286 -- Best, Yanfei Zakelly Lan <[email protected]> 于2023年12月25日周一 21:14写道: > > Hi Rui Fan and Junrui, > > Thanks for the reminder! I agree to change the > 'execution.checkpointing.local-copy' to > 'execution.checkpointing.local-copy.enabled'. > > And for other suggestions Rui proposed: > > 1. How about execution.checkpointing.storage.type instead > > of execution.checkpointing.storage? > > > Ah, I missed something here. Actually I suggest we could merge the current > 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI > configuration named 'execution.checkpointing.dir'. WDYT? > > 3. execution.checkpointing.savepoint.dir is a little weird. > > > > Yes, I think it is better to make 'savepoint' and 'checkpoint' the same > level. But I'm not so sure since there is only one savepoint-related > option. Maybe someone else could share some thoughts here. > > 4. How about execution.recovery.claim-mode instead of > > execution.recovery.mode? > > > > Agreed. That's more accurate. > > > Many thanks for your suggestions! > > Best, > Zakelly > > On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <[email protected]> wrote: > > > Hi Zakelly, > > > > Thanks for driving this. I agree that the proposed restructuring of the > > configuration options is largely positive. It will make understanding and > > working with Flink configurations more intuitive. > > > > Most of the proposed changes look great. Just a heads-up, as Rui Fan > > mentioned, Flink currently requires that no configOption's key be the > > prefix of another to avoid issues when we eventually adopt a standard YAML > > parser, as detailed in FLINK-29372 ( > > https://issues.apache.org/jira/browse/FLINK-29372). Therefore, it's better > > to change the key 'execution.checkpointing.local-copy' because it serves as > > a prefix to the key 'execution.checkpointing.local-copy.dir'. > > > > Best regards, > > Junrui > > > > Rui Fan <[email protected]> 于2023年12月25日周一 19:11写道: > > > > > Hi Zakelly, > > > > > > Thank you for driving this proposal! > > > > > > Overall good for me. I have some questions about these names. > > > > > > 1. How about execution.checkpointing.storage.type instead of > > > execution.checkpointing.storage? > > > > > > It's similar to state.backend.type. > > > > > > 2. How about execution.checkpointing.local-copy.enabled instead of > > > execution.checkpointing.local-copy? > > > > > > You added a new option: execution.checkpointing.local-copy.dir. > > > IIUC, one option name shouldn't be the prefix of other options. > > > If you add a new option execution.checkpointing.local-copy, > > > flink CI will fail directly. > > > > > > 3. execution.checkpointing.savepoint.dir is a little weird. > > > > > > For old options: state.savepoints.dir and state.checkpoints.dir, > > > the savepoint and checkpoint are the same level. It means > > > it's a checkpoint or savepoint. > > > > > > The new option execution.checkpointing.dir is fine for me. > > > However, execution.checkpointing.savepoint.dir is a little weird. > > > I don't know which name is better now. Let us think about it more. > > > > > > 4. How about execution.recovery.claim-mode instead of > > > execution.recovery.mode? > > > > > > The meaning of mode is too broad. The claim-mode may > > > be more accurate for users. > > > > > > WDYT? > > > > > > Best, > > > Rui > > > > > > On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <[email protected]> > > wrote: > > > > > > > Hi devs, > > > > > > > > I'd like to start a discussion on FLIP-406: Reorganize State & > > > > Checkpointing & Recovery Configuration[1]. > > > > > > > > Currently, the configuration options pertaining to checkpointing, > > > recovery, > > > > and state management are primarily grouped under the following > > prefixes: > > > > > > > > - state.backend.* : configurations related to state accessing and > > > > checkpointing, as well as specific options for individual state > > > backends > > > > - execution.checkpointing.* : configurations associated with > > > checkpoint > > > > execution and recovery > > > > - execution.savepoint.*: configurations for recovery from savepoint > > > > > > > > In addition, there are several individual options such as ' > > > > *state.checkpoint-storage*' and '*state.checkpoints.dir*' that fall > > > outside > > > > of these prefixes. The current arrangement of these options, which span > > > > multiple modules, is somewhat haphazard and lacks a systematic > > structure. > > > > For example, the options under the '*CheckpointingOptions*' and ' > > > > *ExecutionCheckpointingOptions*' are related and have no clear > > boundaries > > > > from the user's perspective, but there is no unified prefix for them. > > > With > > > > the upcoming release of Flink 2.0, we have an excellent opportunity to > > > > overhaul and restructure the configurations related to checkpointing, > > > > recovery, and state management. This FLIP proposes to reorganize these > > > > settings, making it more coherent by module, which would significantly > > > > lower the barriers for understanding and reduce the development costs > > > > moving forward. > > > > > > > > Looking forward to hearing from you! > > > > > > > > [1] > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560 > > > > > > > > Best, > > > > Zakelly > > > > > > > > >
