Hi everyone,

It has been 6 days since the last call for discussion. I'd like to start a vote after another 2 days.
Please let me know if you have any concerns. Thanks!

Best,
Zakelly

On Tue, Jan 16, 2024 at 2:54 PM Zakelly Lan <zakelly....@gmail.com> wrote:

> Thanks for the suggestion Rui! The type is added.
>
> Best,
> Zakelly
>
> On Tue, Jan 16, 2024 at 2:33 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi Zakelly,
> >
> > Would you mind adding the option type in the FLIP doc?
> > For example, String, Boolean or Enum, etc. Thank you.
> >
> > Best,
> > Rui
> >
> > On Tue, Jan 16, 2024 at 2:29 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > Thanks all for joining the discussion! I'd like to speed this up since it lasts for nearly a month. I made changes on this FLIP based on suggestions and compromises acceptable to most people. Please feel free to give your opinion. Thanks!
> > > If there are no more suggestions, I will consider starting a vote within a week.
> > >
> > > Best,
> > > Zakelly
> > >
> > > On Thu, Jan 11, 2024 at 10:31 AM Xuannan Su <suxuanna...@gmail.com> wrote:
> > >
> > > > Hi Zakelly,
> > > >
> > > > I am fine with either Option 2 or Option 3. I think the naming in Option 2 makes it clear that it is a boolean configuration. However, most of the currently available boolean configurations do not use "enable" as a suffix. Therefore, Option 3 looks good to me as it follows the current practice.
> > > >
> > > > Best regards,
> > > > Xuannan
> > > >
> > > > On Thu, Jan 11, 2024 at 9:50 AM Hangxiang Yu <master...@gmail.com> wrote:
> > > >
> > > > > > That's a very good point. I realize that the word 'recovery' means way too many things. So I suggest picking a more specific word here, how about 'execution.state-recovery.*' ? Checkpointing and state recovery are corresponding terms and won't make ambiguity.
> > > > >
> > > > > This makes the configuration clearer to me.
> > > > > We could focus on the `state-recovery` at first.
> > > > >
> > > > > > I think we could create another FLIP for the deprecation of LEGACY mode.
> > > > >
> > > > > LGTM, Let's create a new FLIP to do this.
> > > > >
> > > > > > IIUC, there is no clear ownership of the local copy files from the previous job and it's better to define one. This needs more discussion so we could create another thread for this. WDYT?
> > > > >
> > > > > Yeah, I have created a new ticket FLINK-34032 to track and discuss this.
> > > > >
> > > > > On Wed, Jan 10, 2024 at 6:31 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > It seems we still don't have a consensus on the rules for boolean type options. Let me recap the alternatives we have:
> > > > > >
> > > > > > Option 1: Use enumeration options instead if possible. But this may cause some name collisions or confusion as we discussed and we should unify the statement everywhere.
> > > > > > Option 2: Use boolean options and add 'enabled' as the suffix.
> > > > > > Option 3: Use boolean options and ONLY add 'enabled' when there are more detailed configurations under the same prefix, to prevent one name from serving as a prefix to another.
> > > > > >
> > > > > > I am inclined to Option 3, since it is more in line with current practice and friendly for existing users. Also It reduces the length of configuration names as much as possible.
> > > > > >
> > > > > > Looking forward to your opinions! Thanks!
> > > > > >
> > > > > > Best,
> > > > > > Zakelly
> > > > > >
> > > > > > On Wed, Jan 10, 2024 at 3:30 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Hangxiang,
> > > > > > >
> > > > > > > Thanks for your suggestions!
> > > > > > >
> > > > > > > > 1.
> > > > > > > > Could execution.recovery also contain some other behaviors about recovery ? e.g. restart-strategy.
> > > > > > >
> > > > > > > That's a very good point. I realize that the word 'recovery' means way too many things. So I suggest picking a more specific word here, how about 'execution.state-recovery.*' ? Checkpointing and state recovery are corresponding terms and won't make ambiguity.
> > > > > > >
> > > > > > > > 2. Could we also remove some legacy configuration value ? e.g. LEGACY Mode for execution.savepoint-restore-mode/execution.recovery.claim-mode.
> > > > > > >
> > > > > > > I think we could create another FLIP for the deprecation of LEGACY mode.
> > > > > > >
> > > > > > > > 3. Could the local checkpoint be cleaned if execution.checkpointing.local-copy.enabled is true and execution.recovery.from-local is false ? I found it's also an issue if current local-recovery from enabled to disabled. Maybe another ticket is needed.
> > > > > > >
> > > > > > > IIUC, there is no clear ownership of the local copy files from the previous job and it's better to define one. This needs more discussion so we could create another thread for this. WDYT?
> > > > > > >
> > > > > > > Best,
> > > > > > > Zakelly
> > > > > > >
> > > > > > > On Tue, Jan 9, 2024 at 11:23 AM Hangxiang Yu <master...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi, Zakelly.
> > > > > > > > Thanks for driving this. Overall LGTM as we discussed offline.
> > > > > > > >
> > > > > > > > Some comments/suggestions just came to mind:
> > > > > > > > 1. Could execution.recovery also contain some other behaviors about recovery ? e.g. restart-strategy.
> > > > > > > > 2.
> > > > > > > > Could we also remove some legacy configuration value ? e.g. LEGACY Mode for execution.savepoint-restore-mode/execution.recovery.claim-mode.
> > > > > > > > 3. Could the local checkpoint be cleaned if execution.checkpointing.local-copy.enabled is true and execution.recovery.from-local is false ? I found it's also an issue if current local-recovery from enabled to disabled. Maybe another ticket is needed.
> > > > > > > > 4. +1 for enabling execution.checkpointing.incremental by default which is basically default configuration in our production environment.
> > > > > > > >
> > > > > > > > On Mon, Jan 8, 2024 at 6:06 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi Yun,
> > > > > > > > >
> > > > > > > > > Thanks for your comments!
> > > > > > > > >
> > > > > > > > > > 1. We shall not describe the configuration with its implementation for 'execution.checkpointing.local-copy.*' options, for hashmap state-backend, it would write two streams and for Rocksdb state-backend, it would use hard-link for backup. Thus, I think 'execution.checkpointing.local-backup.*' looks better.
> > > > > > > > >
> > > > > > > > > I agreed that we'd better name the option in user's perspective instead of the implementation, thus I name it as a copy of the checkpoint in the local disk, regardless of the way of generating it. The word 'backup' is also suitable for this case, so I agree to change to 'execution.checkpointing.local-backup.*' if no one objects.
> > > > > > > > >
> > > > > > > > > > 2.
> > > > > > > > > > What does the 'execution.checkpointing.data-inline-threshold' mean? It seems not so easy to understand.
> > > > > > > > >
> > > > > > > > > The 'execution.checkpointing.data-inline-threshold' (original one as 'state.storage.fs.memory-threshold') stands for the size threshold below which state chunks will store inline with the metadata, thus I call it 'data-inline-threshold'.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Zakelly
> > > > > > > > >
> > > > > > > > > On Mon, Jan 8, 2024 at 10:09 AM Yun Tang <myas...@live.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Zakelly,
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this topic. I have two concerns here:
> > > > > > > > > >
> > > > > > > > > > 1. We shall not describe the configuration with its implementation for 'execution.checkpointing.local-copy.*' options, for hashmap state-backend, it would write two streams and for Rocksdb state-backend, it would use hard-link for backup. Thus, I think 'execution.checkpointing.local-backup.*' looks better.
> > > > > > > > > > 2. What does the 'execution.checkpointing.data-inline-threshold' mean? It seems not so easy to understand.
> > > > > > > > > >
> > > > > > > > > > Best
> > > > > > > > > > Yun Tang
> > > > > > > > > > ________________________________
> > > > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org>
> > > > > > > > > > Sent: Thursday, January 4, 2024 22:37
> > > > > > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > > > > > Subject: Re: [DISCUSS] FLIP-406: Reorganize State & Checkpointing & Recovery Configuration
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Thanks for trying to clean this up! I don't have strong opinions on the topics discussed here, so generally speaking +1 from my side!
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Piotrek
> > > > > > > > > >
> > > > > > > > > > On Wed, Jan 3, 2024 at 04:16, Rui Fan <1996fan...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > >
> > > > > > > > > > > Using the `execution.checkpointing.incremental.enabled`, and enabling it by default sounds good to me.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Rui
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jan 3, 2024 at 11:10 AM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Rui,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for your comments!
> > > > > > > > > > > >
> > > > > > > > > > > > IMO, given that the state backend can be pluggably loaded (as you can specify a state backend factory), I prefer not providing state backend specified options in the framework.
> > > > > > > > > > > >
> > > > > > > > > > > > Secondly, the incremental checkpoint is actually a sharing file strategy across checkpoints, which means the state backend *could* reuse files from previous cp but not *must* do so. When the state backend could not reuse the files, it is reasonable to fallback to a full checkpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > Thus, I suggest we make it `execution.checkpointing.incremental` and enable it by default. For those state backends not supporting this, they perform full checkpoints and print a warning to inform users. Users do not need to pay special attention to different options to control this across different state backends. This is more user-friendly in my opinion. WDYT?
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jan 2, 2024 at 10:49 AM Rui Fan <1996fan...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm not sure whether we could add the state backend type in the new key name of state.backend.incremental. It means we use `execution.checkpointing.rocksdb-incremental` or `execution.checkpointing.rocksdb-incremental.enabled`.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So far, state.backend.incremental only works for rocksdb state backend.
> > > > > > > > > > > > > And this feature or optimization is very valuable and huge for large state flink jobs. I believe it's enabled for most production flink jobs with large rocksdb state.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If this option isn't generic for all state backend types, I guess we can enable `execution.checkpointing.rocksdb-incremental.enabled` by default in Flink 2.0.
> > > > > > > > > > > > >
> > > > > > > > > > > > > But if it works for all state backends, it's hard to enable it by default. Enabling great and valuable features or improvements are useful for users, especially a lot of new flink users. Out-of-the-box options are good for users.
> > > > > > > > > > > > >
> > > > > > > > > > > > > WDYT?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Rui
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Dec 29, 2023 at 1:45 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks all for your comments!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As many of you have questions about the names for boolean options, I suggest we make a naming rule for them. For now I could think of three options:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Option 1: Use enumeration options if possible.
> > > > > > > > > > > > > > But this may cause some name collisions or confusion as we discussed and we should unify the statement everywhere.
> > > > > > > > > > > > > > Option 2: Use boolean options and add 'enabled' as the suffix.
> > > > > > > > > > > > > > Option 3: Use boolean options and ONLY add 'enabled' when there are more detailed configurations under the same prefix, to prevent one name from serving as a prefix to another.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am slightly inclined to Option 3, since it is more in line with current practice and friendly for existing users. Also It reduces the length of configuration names as much as possible. I really want to hear your opinions.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Xuannan
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I agree with your comments 1 and 3.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For 2, If we decide to change the name, maybe `execution.checkpointing.parallel-cleaner` is better? And as for whether to add 'enabled' I suggest we discuss the rule above. WDYT?
> > > > > > > > > > > > > > Thanks!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Zakelly
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Dec 29, 2023 at 12:02 PM Xuannan Su <suxuanna...@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for driving this! The organization of the configuration option in the FLIP looks much cleaner and easier to understand. +1 to the FLIP.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Just some questions from me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. I think the change to the ConfigOptions should be put in the `Public Interface` section, instead of `Proposed Changed`, as those configuration options are public interface.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. The key `state.checkpoint.cleaner.parallel-mode` seems confusing. It feels like it is used to choose different modes. In fact, it is a boolean flag to indicate whether to enable parallel clean. How about making it `state.checkpoint.cleaner.parallel-mode.enabled`?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3.
> > > > > > > > > > > > > > > The `execution.checkpointing.write-buffer` may better be `execution.checkpointing.write-buffer-size` so that we know it is configuring the size of the buffer.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Xuannan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Dec 27, 2023 at 7:17 PM Yanfei Lei <fredia...@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Considering the name occupation, how about naming it as `execution.checkpointing.type`?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > `Checkpoint Type`[1,2] is used to describe aligned/unaligned checkpoint, I am inclined to make a choice between `execution.checkpointing.incremental` and `execution.checkpointing.incremental.enabled`.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1] https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/monitoring/checkpoint_monitoring/
> > > > > > > > > > > > > > > > [2] https://github.com/apache/flink/blob/master/flink-runtime-web/web-dashboard/src/app/pages/job/checkpoints/detail/job-checkpoints-detail.component.html#L27
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yanfei
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Dec 27, 2023 at 14:41, Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Lijie,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the reminder! I missed this.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Considering the name occupation, how about naming it as `execution.checkpointing.type`?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Actually I think the current `execution.checkpointing.mode` is confusing in some ways, maybe `execution.checkpointing.data-consistency` is better.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Zakelly
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Dec 27, 2023 at 12:59 PM Lijie Wang <wangdachui9...@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I'm wondering if `execution.checkpointing.savepoint-dir` would be better.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > `execution.checkpointing.dir` and `execution.checkpointing.savepoint-dir` are also fine for me.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > So I think an enumeration option `execution.checkpointing.mode` which can be 'full' (default) or 'incremental' would be better
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I agree with using an enumeration option. But currently there is already a configuration option called `execution.checkpointing.mode`, which is used to choose EXACTLY_ONCE or AT_LEAST_ONCE. Maybe we need to use another name or merge these two options.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Lijie
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Dec 27, 2023 at 11:43, Zakelly Lan <zakelly....@gmail.com> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks all for your comments!
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Yanfei
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1. For some state backends that do not support incremental checkpoint, how does the execution.checkpointing.incremental option take effect? Or is it better to put incremental under state.backend.xxx.incremental?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I'd rather not put the option for incremental checkpoint under the 'state.backend', since it is more about the checkpointing instead of state accessing. Of course, the state backend may not necessarily do incremental checkpoint as requested. If the state backend is not capable of taking incremental cp, it is better to fallback to the full cp.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2. I'm a little worried that putting all configurations into `ExecutionCheckpointingOptions` will introduce some dependency problems. Some options would be used by flink-runtime module, but flink-runtime should not depend on flink-streaming-java. e.g. FLINK-28286[1].
> > > > > > > > > > > > > > > > > > > > So, I prefer to move configurations to `CheckpointingOptions`, WDYT?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Yes, that's a very good point. Moving to `CheckpointingOptions`(flink-core) makes sense.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Lijie
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > How about
> > > > > > > > > > > > > > > > > > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
> > > > > > > > > > > > > > > > > > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Actually, I think the `checkpointing.checkpoint` may cause some confusion. But I'm ok if others agree.
> > > > > > > > > > > > > > > > > > > I'm wondering if `execution.checkpointing.savepoint-dir` would be better.
> > > > > > > > > > > > > > > > > > > WDYT?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2. We changed the 'execution.checkpointing.local-copy' to 'execution.checkpointing.local-copy.enabled'. Should we also add "enabled" suffix for other boolean type configuration options ? For example,
> > > > > > > > > > > > > > > > > > > > execution.checkpointing.incremental -> execution.checkpointing.incremental.enabled
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Actually, the incremental cp is something like choosing a mode for doing checkpoint instead of enabling a function. So I think an enumeration option `execution.checkpointing.mode` which can be 'full' (default) or 'incremental' would be better, WDYT?
> > > > > > > > > > > > > > > > > > > And @Rui Fan @Yanfei What do you think about this?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Tue, Dec 26, 2023 at 5:15 PM Lijie Wang <wangdachui9...@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks for driving the discussion.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1.
> > > > > > > > > > > > > > > > > > > > > But I'm not so sure since there is only one savepoint-related option. Maybe someone else could share some thoughts here.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > How about
> > > > > > > > > > > > > > > > > > > > state.savepoints.dir -> execution.checkpointing.savepoint.dir
> > > > > > > > > > > > > > > > > > > > state.checkpoints.dir -> execution.checkpointing.checkpoint.dir
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2. We changed the 'execution.checkpointing.local-copy' to 'execution.checkpointing.local-copy.enabled'. Should we also add "enabled" suffix for other boolean type configuration options ? For example,
> > > > > > > > > > > > > > > > > > > > execution.checkpointing.incremental -> execution.checkpointing.incremental.enabled
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > In this way, the naming style of configuration options is unified, and it can avoid potential similar problems (for example, we may need to add more options for incremental checkpoint in the future).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Lijie
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Tue, Dec 26, 2023 at 12:05, Yanfei Lei <fredia...@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hi Zakelly,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thank you for creating the FLIP and starting the discussion.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > The current arrangement of these options is indeed somewhat haphazard, and the new arrangement looks much better.
I have some questions about the arrangement of some new configuration options:

1. For some state backends that do not support incremental checkpoint, how does the execution.checkpointing.incremental option take effect? Or is it better to put incremental under state.backend.xxx.incremental?

2. I'm a little worried that putting all configurations into `ExecutionCheckpointingOptions` will introduce some dependency problems. Some options would be used by the flink-runtime module, but flink-runtime should not depend on flink-streaming-java, e.g. FLINK-28286[1].
So, I prefer to move configurations to `CheckpointingOptions`, WDYT?

[1] https://issues.apache.org/jira/browse/FLINK-28286

--
Best,
Yanfei

On Mon, Dec 25, 2023 at 9:14 PM Zakelly Lan <zakelly....@gmail.com> wrote:

Hi Rui Fan and Junrui,

Thanks for the reminder! I agree to change 'execution.checkpointing.local-copy' to 'execution.checkpointing.local-copy.enabled'.

And for other suggestions Rui proposed:

> 1. How about execution.checkpointing.storage.type instead of execution.checkpointing.storage?

Ah, I missed something here. Actually I suggest we could merge the current 'state.checkpoints.dir' and 'state.checkpoint-storage' into one URI configuration named 'execution.checkpointing.dir'. WDYT?

> 3. execution.checkpointing.savepoint.dir is a little weird.

Yes, I think it is better to make 'savepoint' and 'checkpoint' the same level. But I'm not so sure since there is only one savepoint-related option. Maybe someone else could share some thoughts here.

> 4. How about execution.recovery.claim-mode instead of execution.recovery.mode?

Agreed. That's more accurate.

Many thanks for your suggestions!

Best,
Zakelly

On Mon, Dec 25, 2023 at 8:18 PM Junrui Lee <jrlee....@gmail.com> wrote:

Hi Zakelly,

Thanks for driving this.
I agree that the proposed restructuring of the configuration options is largely positive. It will make understanding and working with Flink configurations more intuitive.

Most of the proposed changes look great. Just a heads-up, as Rui Fan mentioned, Flink currently requires that no configOption's key be the prefix of another to avoid issues when we eventually adopt a standard YAML parser, as detailed in FLINK-29372 ( https://issues.apache.org/jira/browse/FLINK-29372 ).
Therefore, it's better to change the key 'execution.checkpointing.local-copy' because it serves as a prefix to the key 'execution.checkpointing.local-copy.dir'.

Best regards,
Junrui

On Mon, Dec 25, 2023 at 7:11 PM Rui Fan <1996fan...@gmail.com> wrote:

Hi Zakelly,

Thank you for driving this proposal!

Overall good for me. I have some questions about these names.

1. How about execution.checkpointing.storage.type instead of execution.checkpointing.storage?

It's similar to state.backend.type.

2. How about execution.checkpointing.local-copy.enabled instead of execution.checkpointing.local-copy?

You added a new option: execution.checkpointing.local-copy.dir.
IIUC, one option name shouldn't be the prefix of other options.
If you add a new option execution.checkpointing.local-copy, Flink CI will fail directly.

3. execution.checkpointing.savepoint.dir is a little weird.

For old options: state.savepoints.dir and state.checkpoints.dir, the savepoint and checkpoint are the same level. It means it's a checkpoint or savepoint.

The new option execution.checkpointing.dir is fine for me.
However, execution.checkpointing.savepoint.dir is a little weird.
I don't know which name is better now. Let us think about it more.

4. How about execution.recovery.claim-mode instead of execution.recovery.mode?

The meaning of mode is too broad. The claim-mode may be more accurate for users.

WDYT?

Best,
Rui

On Mon, Dec 25, 2023 at 5:14 PM Zakelly Lan <zakelly....@gmail.com> wrote:

Hi devs,

I'd like to start a discussion on FLIP-406: Reorganize State & Checkpointing & Recovery Configuration[1].

Currently, the configuration options pertaining to checkpointing, recovery, and state management are primarily grouped under the following prefixes:

   - state.backend.* : configurations related to state accessing and checkpointing, as well as specific options for individual state backends
   - execution.checkpointing.* : configurations associated with checkpoint execution and recovery
   - execution.savepoint.* : configurations for recovery from savepoint

In addition, there are several individual options such as '*state.checkpoint-storage*' and '*state.checkpoints.dir*' that fall outside of these prefixes. The current arrangement of these options, which span multiple modules, is somewhat haphazard and lacks a systematic structure. For example, the options under the '*CheckpointingOptions*' and '*ExecutionCheckpointingOptions*' are related and have no clear boundaries from the user's perspective, but there is no unified prefix for them.
With the upcoming release of Flink 2.0, we have an excellent opportunity to overhaul and restructure the configurations related to checkpointing, recovery, and state management. This FLIP proposes to reorganize these settings, making them more coherent by module, which would significantly lower the barriers for understanding and reduce the development costs moving forward.

Looking forward to hearing from you.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=284789560

Best,
Zakelly

--
Best,
Hangxiang.
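
As an illustration of the key-prefix constraint raised in this thread (FLINK-29372), here is a minimal Python sketch. It is not Flink's actual check: the helper names find_prefix_conflicts and to_nested are hypothetical, and it assumes that flat dotted keys get expanded into nested mappings the way a standard YAML document would represent them. Under that assumption a key cannot hold both a scalar value and child keys, which is why 'execution.checkpointing.local-copy' and 'execution.checkpointing.local-copy.dir' cannot coexist.

```python
def find_prefix_conflicts(keys):
    """Return (shorter, longer) pairs where `shorter` is a dotted prefix of `longer`."""
    ordered = sorted(set(keys))
    conflicts = []
    for i, key in enumerate(ordered):
        for other in ordered[i + 1:]:
            # "a.b" conflicts with "a.b.c", but not with "a.b-c".
            if other.startswith(key + "."):
                conflicts.append((key, other))
    return conflicts


def to_nested(flat):
    """Expand flat dotted keys into nested dicts (assumed YAML-style layout).

    Raises ValueError when a key would have to be a scalar and a mapping at once.
    """
    root = {}
    for key, value in flat.items():
        node = root
        *parents, leaf = key.split(".")
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"'{key}' collides with a scalar at '{part}'")
            node = child
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"'{key}' is both a value and a parent")
        node[leaf] = value
    return root
```

With the two keys discussed above, find_prefix_conflicts reports the pair and to_nested raises; renaming the boolean option to 'execution.checkpointing.local-copy.enabled' makes both keys siblings under 'local-copy' and the conflict disappears.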