Hi Matthias,

Thanks for your proposal! I have a few questions:

1. Is it possible for a change event observed right after a completed checkpoint (or within a specific short time after a checkpoint) to trigger a rescale immediately? Sometimes the checkpoint interval is huge, and it is better to rescale immediately.
2. Should we introduce `CheckpointLifecycleListener` instead of reusing `CheckpointListener`? Is `CheckpointListener` enough for this scenario?

Best,
Zakelly

On Wed, Jun 5, 2024 at 3:02 PM Matthias Pohl <map...@apache.org> wrote:

> Hi ConradJam,
> thanks for your response.
>
> The CheckpointStatsTracker gets notified about the checkpoint completion
> after the checkpoint is finalized, i.e. all its data is persisted and the
> metadata is written to the CompletedCheckpointStore. At this moment, the
> checkpoint is considered for restoring a job and, therefore, becomes
> available for restarts. This workflow also applies to unaligned
> checkpoints. But I see how this context might be helpful for understanding
> the change. I will add it to the FLIP. So far, I don't see a reason
> to disable the feature for unaligned checkpoints. Do you see other issues
> that might make it necessary to disable this feature for this type of
> checkpoints?
>
> Can you elaborate a bit more what you mean by "checkpoints that do not
> check it"? I do not fully understand what you are referring to with "it"
> here.
>
> Best,
> Matthias
>
> On Wed, Jun 5, 2024 at 4:46 AM ConradJam <jam.gz...@gmail.com> wrote:
>
> > I have a few questions:
> > Unaligned checkpoints Do we need to enable this feature? Whether this
> > feature should be disabled for checkpoints that do not check it
> >
> > Matthias Pohl <map...@apache.org> wrote on Tue, Jun 4, 2024 at 18:03:
> >
> > > Hi everyone,
> > > I'd like to discuss FLIP-461 [1]. The FLIP proposes the synchronization of
> > > rescaling and the completion of checkpoints. The idea is to reduce the
> > > amount of data that needs to be processed after rescaling happened. A more
> > > detailed motivation can be found in FLIP-461.
> > >
> > > I'm looking forward to feedback and suggestions.
> > >
> > > Best,
> > > Matthias
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing
> >
> > --
> > Best
> >
> > ConradJam