Hi Zakelly,
thanks for your reply. See my inlined responses below:

On Wed, Jun 5, 2024 at 10:26 AM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi Matthias,
>
> Thanks for your proposal! I have a few questions:
>
> 1. Is it possible for a change event observed right after a completed
> checkpoint (or within a specific short time after a checkpoint) to trigger
> a rescale immediately? Sometimes the checkpoint interval is huge and it is
> better to rescale immediately.
>

That's something that could be considered as another optimization. I would
consider this as a possible follow-up. My concern here is that we'd make
the rescaling configuration even more complicated by introducing yet
another parameter.
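
To illustrate the kind of extra parameter I have in mind, here is a minimal
sketch. Everything in it is an assumption for the sake of discussion: the
class name, the method name and the `gracePeriod` parameter are hypothetical
and are not part of FLIP-461 or the Flink codebase.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Flink or FLIP-461.
final class ImmediateRescaleSketch {

    private ImmediateRescaleSketch() {}

    /**
     * Returns true if the change event was observed no later than
     * `gracePeriod` (hypothetical parameter) after the last completed
     * checkpoint, i.e. rescaling right away would lose little progress.
     */
    static boolean shouldRescaleImmediately(
            Instant lastCheckpointCompletion,
            Instant changeEventTime,
            Duration gracePeriod) {
        Duration sinceCheckpoint =
                Duration.between(lastCheckpointCompletion, changeEventTime);
        // A change event before the checkpoint completed does not qualify.
        return !sinceCheckpoint.isNegative()
                && sinceCheckpoint.compareTo(gracePeriod) <= 0;
    }
}
```

The check itself is trivial; the cost is that users would have to understand
and tune one more setting next to the existing rescaling options.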


> 2. Should we introduce `CheckpointLifecycleListener` instead of reusing
> `CheckpointListener`? Is `CheckpointListener` enough for this scenario?
>

Good point, they serve similar purposes. But I'm hesitant to use
CheckpointListener (which is a public interface) for this internal, quite
narrowly scoped, runtime-specific use case of FLIP-461.

It might be worth renaming the internal interface to something that
indicates its internal usage to avoid confusion.
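
As a rough sketch of how such an internal listener could be kept separate
from the public API: all names below are hypothetical placeholders chosen
for illustration. They are not the interface proposed in the FLIP and not
existing Flink classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical internal listener; deliberately not the public CheckpointListener.
interface InternalCheckpointCompletionListener {
    void onCompletedCheckpoint(long checkpointId);
}

// Hypothetical stand-in for the component that learns about finalized checkpoints.
final class SketchCheckpointTracker {
    private final List<InternalCheckpointCompletionListener> listeners = new ArrayList<>();

    void register(InternalCheckpointCompletionListener listener) {
        listeners.add(listener);
    }

    // Assumed to be called only after the checkpoint has been finalized.
    void reportCompletedCheckpoint(long checkpointId) {
        for (InternalCheckpointCompletionListener listener : listeners) {
            listener.onCompletedCheckpoint(checkpointId);
        }
    }
}

// Hypothetical coordinator that defers a requested rescale until the next
// completed checkpoint.
final class SketchRescaleCoordinator implements InternalCheckpointCompletionListener {
    private boolean rescalePending;
    private final List<Long> rescaledAfter = new ArrayList<>();

    // Records that a change event asked for a rescale.
    void onChangeEvent() {
        rescalePending = true;
    }

    @Override
    public void onCompletedCheckpoint(long checkpointId) {
        if (rescalePending) {
            rescalePending = false;
            // A real implementation would trigger the rescale here.
            rescaledAfter.add(checkpointId);
        }
    }

    // IDs of the checkpoints after which a rescale was triggered.
    List<Long> rescaledAfterCheckpoints() {
        return rescaledAfter;
    }
}
```

The point of the sketch is only that the listener stays package-private and
single-purpose, so it cannot be confused with the user-facing interface.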


> Best,
> Zakelly
>
> On Wed, Jun 5, 2024 at 3:02 PM Matthias Pohl <map...@apache.org> wrote:
>
> > Hi ConradJam,
> > thanks for your response.
> >
> > The CheckpointStatsTracker gets notified about the checkpoint completion
> > after the checkpoint is finalized, i.e. all its data is persisted and the
> > metadata is written to the CompletedCheckpointStore. At this moment, the
> > checkpoint is considered for restoring a job and, therefore, becomes
> > available for restarts. This workflow also applies to unaligned
> > checkpoints. But I see how this context might be helpful for understanding
> > the change. I will add it to the FLIP. So far, I don't see a reason
> > to disable the feature for unaligned checkpoints. Do you see other issues
> > that might make it necessary to disable this feature for this type of
> > checkpoints?
> >
> > Can you elaborate a bit more on what you mean by "checkpoints that do not
> > check it"? I do not fully understand what you are referring to with "it"
> > here.
> >
> > Best,
> > Matthias
> >
> > On Wed, Jun 5, 2024 at 4:46 AM ConradJam <jam.gz...@gmail.com> wrote:
> >
> > > I have a few questions:
> > > Do we need to enable this feature for unaligned checkpoints? Should this
> > > feature be disabled for checkpoints that do not check it?
> > >
> > > On Tue, Jun 4, 2024 at 6:03 PM Matthias Pohl <map...@apache.org> wrote:
> > >
> > > > Hi everyone,
> > > > I'd like to discuss FLIP-461 [1]. The FLIP proposes the synchronization
> > > > of rescaling and the completion of checkpoints. The idea is to reduce
> > > > the amount of data that needs to be processed after rescaling happened.
> > > > A more detailed motivation can be found in FLIP-461.
> > > >
> > > > I'm looking forward to feedback and suggestions.
> > > >
> > > > Best,
> > > > Matthias
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing
> > > >
> > >
> > >
> > > --
> > > Best
> > >
> > > ConradJam
> > >
> >
>
