Chesnay,

Seems like you are suggesting that the Adaptive scheduler does everything
the standard scheduler does and more.

I am clearly not an expert on this topic, but can you please explain why the
AdaptiveScheduler is not the default scheduler?
If it can do everything, why do we even have 2 schedulers? Why not simply
drop the "old" one?

That would probably clear up all the confusion then :)

Gyula

On Thu, Jan 26, 2023 at 6:23 PM Chesnay Schepler <ches...@apache.org> wrote:

> There's the default and reactive mode; nothing else.
> At their core they are the same thing; reactive mode just cranks up the
> desired parallelism to infinity and enforces certain assumptions (e.g.,
> no active resource management).
>
> The advantage is that the adaptive scheduler can run jobs while
> insufficient resources are available, and scale them up again once the
> resources become available.
> This is its core functionality, but we always intended to extend it
> such that users can modify the parallelism at runtime as well.
> And since the AS can already rescale jobs (and was purpose-built with
> that functionality in mind), this is just a matter of exposing an API
> for it. Everything else is already there.
>
> As a concrete use-case, let's say you have an SLA that says jobs must
> not be down longer than X seconds, and a TM just crashed.
> If you can absolutely guarantee that your k8s cluster can provision a
> new TM within X seconds, no matter what cruel reality has in store for
> you, then you /may/ not need it.
> If you can't, well then here's a use-case for you.
>
>  > Last time I looked they implemented the same interface and the same
>  > base class. Of course, their behavior is quite different.
>
> They have never shared a base class, from day 1 on. Are you maybe mixing up the
> AdaptiveScheduler and AdaptiveBatchScheduler?
>
> As for FLINK-30773, I think that should be covered.
>
> On 26/01/2023 17:10, Maximilian Michels wrote:
> > Thanks for the explanation. If not for the "reactive mode", what is
> > the advantage of the adaptive scheduler? What other modes does it
> > support?
> >
> >> Apart from implementing the same interface, the implementations of the
> >> adaptive and default schedulers are separate.
> > Last time I looked they implemented the same interface and the same
> > base class. Of course, their behavior is quite different.
> >
> > I'm still very interested in learning about the future FLIPs
> > mentioned. Based on the replies, I'm assuming that they will support
> > the changes required for
> > https://issues.apache.org/jira/browse/FLINK-30773, or at least provide
> > the basis for implementing them.
> >
> > -Max
> >
> > On Thu, Jan 26, 2023 at 4:57 PM Chesnay Schepler <ches...@apache.org> wrote:
> >> On 26/01/2023 16:18, Maximilian Michels wrote:
> >>
> >> I see slightly different goals for the standard and the adaptive
> >> scheduler. The adaptive scheduler's goal is to adapt the Flink job
> >> according to the available resources.
> >>
> >> This is really a misconception that we just have to stomp out.
> >>
> >> This statement only applies to reactive mode, a special mode that the
> >> adaptive scheduler (AS) can run in, where active resource management is
> >> not supported, since requesting infinite resources from k8s doesn't
> >> really make sense.
> >>
> >> The AS itself can work perfectly fine with active resource management,
> >> and has no effect on how the RM talks to k8s. It can just keep the job
> >> running in cases where k8s provides fewer resources than desired
> >> (== user-provided parallelism), possibly temporarily.
> >>
> >> On 26/01/2023 16:18, Maximilian Michels wrote:
> >>
> >> After all, both schedulers share the same super class
> >>
> >> Apart from implementing the same interface, the implementations of the
> >> adaptive and default schedulers are separate.
>
>
