Re: [DISCUSS] FLIP-56: Dynamic Slot Allocation

Xintong Song Tue, 03 Mar 2020 01:47:57 -0800

Hi Shaoxun,

You're right that supporting end-to-end fine-grained resource management
is a huge plan, and FLIP-56 is only one step towards it.

Regarding your questions:

> First, does "specific request" for the slots mean the requesting slot
> profile contains detailed information about memory and cpu? Is it done when
> it scheduled the execution graph?

Yes, that means slot requests contain detailed information about how much
CPU and memory is needed.

> And how does a job manager determine to ask how much memory?

A job graph should contain how many resources each vertex/task needs, and
the JobMaster knows how many resources to request for each slot by adding up
the resources of the tasks it plans to deploy in that slot.
Regarding how to initially set the resources in the job graph, there could
be various ways.

   - We can expose an interface to let the user decide how many resources each
   operator needs, like what you can do currently in the DataStream API. But we
   probably want to change that later for better usability.
   - The compiler can set it automatically, according to the operator type
   and some configured default values for each type.
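
To make the "adding up" step above concrete, here is a minimal sketch in
Python. It is only an illustration: ResourceSpec, merge and slot_request_for
are hypothetical names, not Flink's actual classes or API, and the numbers
are made up.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a per-task resource requirement (not Flink's API).
@dataclass(frozen=True)
class ResourceSpec:
    cpu_cores: float
    memory_mb: int

    def merge(self, other: "ResourceSpec") -> "ResourceSpec":
        # Component-wise sum of two resource requirements.
        return ResourceSpec(self.cpu_cores + other.cpu_cores,
                            self.memory_mb + other.memory_mb)


ZERO = ResourceSpec(0.0, 0)


def slot_request_for(task_specs):
    """Sum the requirements of all tasks planned for one slot."""
    total = ZERO
    for spec in task_specs:
        total = total.merge(spec)
    return total


# Three tasks that are planned to share one slot (made-up numbers).
tasks_in_slot = [
    ResourceSpec(0.5, 128),
    ResourceSpec(1.0, 256),
    ResourceSpec(0.5, 64),
]
request = slot_request_for(tasks_in_slot)
```

With these inputs the slot request asks for 2.0 CPU cores and 448 MB of
memory, i.e. the component-wise sum over the three tasks.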

Anyway, fine-grained resource management is an advanced feature,
targeting expert users who know well how many resources their jobs/tasks
need. There are also various efforts to make task-level fine-grained
resource configuration automatic, which are out of the scope of
this FLIP.

> Second, will the dynamic allocation create the fragments?

Yes, it will. You can also look at FLINK-14106, where we try to make the
slot allocation strategy pluggable, so that we can have different strategies
for different use cases. E.g., one strategy could start TMs only when
slot requests are received, with exactly the resources the slots request.
That avoids fragments, at the cost of longer scheduling time because TMs
are started late, which should be acceptable for long-running streaming
jobs. Another strategy could start a configured number of TMs with
predefined resources before receiving any slot request. The benefit is
that jobs get scheduled immediately, and the cost is potential
fragments, which I believe is more suitable for short batch queries.
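
A toy sketch of the trade-off between the two strategies, in Python. It
considers memory only, and the function names, the first-fit placement and
all numbers are assumptions for illustration; none of this is taken from
FLINK-14106 or from Flink's code.

```python
def prestarted_first_fit(tm_count, tm_memory_mb, requests_mb):
    """Pre-start tm_count TMs of a fixed size and carve slots out of them.

    Returns (free memory left on each TM, requests that could not be placed).
    """
    free = [tm_memory_mb] * tm_count
    unplaced = []
    for req in requests_mb:
        for i, avail in enumerate(free):
            if avail >= req:
                free[i] = avail - req  # carve the slot out of this TM
                break
        else:
            # No single TM has enough room, even if the total would suffice.
            unplaced.append(req)
    return free, unplaced


def on_demand_exact(requests_mb):
    """Start one TM per request, sized exactly to it, so nothing is left over."""
    return list(requests_mb)  # memory size of each TM that gets started


requests = [600, 600, 300, 300, 200]
free, unplaced = prestarted_first_fit(2, 1024, requests)
tm_sizes = on_demand_exact(requests)
```

Here the two pre-started 1024 MB TMs end up with 124 MB free each. The
200 MB request cannot be placed although 248 MB is free in total, which is
the kind of fragment you describe. The on-demand variant leaves no
fragments, but every request has to wait for a TM to start.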

Thank you~

Xintong Song

On Tue, Mar 3, 2020 at 3:00 PM shaoxun <838492...@qq.com> wrote:

> Hi Xintong, it is a huge plan to carry on. And I get a few questions about
> the details.
>
> First, does "specific request" for the slots mean the requesting slot
> profile contains detailed information about memory and cpu? And how does a
> job manager determine to ask how much memory? Is it done when
>  it scheduled the execution graph? Or maybe I miss something here.
>
> Second, will the dynamic allocation create the fragments? For example, if a
> task executor has 100mb memory left and maybe other tasks all ask for a
> larger memory size.
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>
