Re: Reworking the Rescale API

Chesnay Schepler Fri, 27 Jan 2023 00:59:04 -0800

The adaptive scheduler only supports streaming jobs. That's the biggest
limitation that probably won't be fixed anytime soon.

The goal was though to make the adaptive scheduler the default for
streaming jobs eventually. It was very much meant as a better version of
the default scheduler for streaming jobs.


On 26/01/2023 19:06, David Morávek wrote:

Hi Gyula,

can you please explain why the AdaptiveScheduler is not the default
scheduler?


There are still some smaller bits missing. As far as I know, the missing
parts are:

1) Local recovery (reusing the already downloaded state files after restart
/ rescale)
2) Support for fine-grained resource management
3) Support for the session cluster (Chesnay will be submitting a FLIP for
this soon)

We're looking into addressing all of these limitations in the short term.

Personally, I'd love to start a discussion about transitioning the
AdaptiveScheduler into the default one after those limitations are fixed.
Being able to eventually deprecate and remove the DefaultScheduler would
simplify the code-base by a lot since there are many adapters between new
and old interfaces (e.g., SlotPool-related interfaces).

Best,
D.

On Thu, Jan 26, 2023 at 6:27 PM Gyula Fóra <[email protected]> wrote:

Chesnay,

Seems like you are suggesting that the Adaptive scheduler does everything
the standard scheduler does and more.

I am clearly not an expert on this topic but can you please explain why the
AdaptiveScheduler is not the default scheduler?
If it can do everything, why do we even have 2 schedulers? Why not simply
drop the "old" one?

That would probably clear up all confusion then :)

Gyula

On Thu, Jan 26, 2023 at 6:23 PM Chesnay Schepler <[email protected]>
wrote:

There's the default and reactive mode; nothing else.
At its core they are the same thing; reactive mode just cranks up the
desired parallelism to infinity and enforces certain assumptions (e.g.,
no active resource management).

The advantage is that the adaptive scheduler can run jobs while
insufficient resources are available, and scale things up again once they
are available.
This is its core functionality, but we always intended to extend it
such that users can modify the parallelism at runtime as well.
And since the AS can already rescale jobs (and was purpose-built with
that functionality in mind), this is just a matter of exposing an API
for it. Everything else is already there.
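
To make the relationship between the default mode and reactive mode concrete, here is a minimal sketch of the idea described above. It is a toy model, not Flink code: the class `ParallelismSketch`, its methods, and the use of a single job-wide parallelism value are all assumptions made for illustration. It only shows that the job runs with whatever is available up to the desired parallelism, and that reactive mode can be thought of as setting the desired parallelism to "infinity".

```java
// Toy model only: these names are hypothetical and are NOT Flink's scheduler API.
final class ParallelismSketch {

    // Reactive mode is modeled as "desired parallelism is effectively infinite".
    static final int REACTIVE_DESIRED = Integer.MAX_VALUE;

    /** Parallelism the job runs with, given the desired value and the free slots. */
    static int effectiveParallelism(int desired, int availableSlots) {
        if (desired < 1 || availableSlots < 0) {
            throw new IllegalArgumentException(
                    "desired must be >= 1 and availableSlots must be >= 0");
        }
        // Use what is available, but never exceed the desired parallelism.
        return Math.min(desired, availableSlots);
    }

    /** In this model the job can keep running as long as one slot is available. */
    static boolean canRun(int desired, int availableSlots) {
        return effectiveParallelism(desired, availableSlots) >= 1;
    }

    public static void main(String[] args) {
        int desired = 8; // user-provided parallelism

        // All resources present: run at the desired parallelism.
        System.out.println(effectiveParallelism(desired, 8));
        // A TM holding two slots crashed: keep running at reduced parallelism.
        System.out.println(effectiveParallelism(desired, 6));
        // The replacement TM arrived: scale back up.
        System.out.println(effectiveParallelism(desired, 8));
        // Reactive mode: take every slot the cluster offers.
        System.out.println(effectiveParallelism(REACTIVE_DESIRED, 12));
    }
}
```

In this model, exposing a rescale API amounts to letting the user change `desired` while the job is running; the real scheduler additionally has to deal with per-operator parallelism, state redistribution, and restarts, none of which is modeled here.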

As a concrete use-case, let's say you have an SLA that says jobs must
not be down longer than X seconds, and a TM just crashed.
If you can absolutely guarantee that your k8s cluster can provision a
new TM within X seconds, no matter what cruel reality has in store for
you, then you /may/ not need it.
If you can't, well then here's a use-case for you.

  > Last time I looked they implemented the same interface and the same
base class. Of course, their behavior is quite different.

They never shared a base class since day 1. Are you maybe mixing up the
AdaptiveScheduler and AdaptiveBatchScheduler?

As for FLINK-30773, I think that should be covered.

On 26/01/2023 17:10, Maximilian Michels wrote:

Thanks for the explanation. If not for the "reactive mode", what is
the advantage of the adaptive scheduler? What other modes does it
support?

Apart from implementing the same interface the implementations of the
adaptive and default schedulers are separate.

Last time I looked they implemented the same interface and the same
base class. Of course, their behavior is quite different.

I'm still very interested in learning about the future FLIPs
mentioned. Based on the replies, I'm assuming that they will support
the changes required for
https://issues.apache.org/jira/browse/FLINK-30773, or at least provide
the basis for implementing them.

-Max

On Thu, Jan 26, 2023 at 4:57 PM Chesnay Schepler <[email protected]>
wrote:

On 26/01/2023 16:18, Maximilian Michels wrote:

I see slightly different goals for the standard and the adaptive
scheduler. The adaptive scheduler's goal is to adapt the Flink job
according to the available resources.

This is really a misconception that we just have to stomp out.

This statement only applies to reactive mode, a special mode that the
adaptive scheduler (AS) can run in, where active resource management is
not supported since requesting infinite resources from k8s doesn't really
make sense.

The AS itself can work perfectly fine with active resource management,
and has no effect on how the RM talks to k8s. It can just keep the job
running in cases where less than desired (==user-provided parallelism)
resources are provided by k8s (possibly temporarily).

On 26/01/2023 16:18, Maximilian Michels wrote:

After all, both schedulers share the same super class

Apart from implementing the same interface the implementations of the
adaptive and default schedulers are separate.
