Hi Shaoxun,

You're right that supporting end-to-end fine-grained resource management is a huge plan, and FLIP-56 is only one step towards it.
Regarding your questions:

> First, does "specific request" for the slots mean the requesting slot
> profile contains detailed information about memory and cpu? Is it done when
> it scheduled the execution graph?

Yes, that means slot requests contain detailed information about how much CPU and memory is needed.

> And how does a job manager determine to ask how much memory?

A job graph should contain how many resources each vertex/task needs, and the JobMaster knows how many resources to request for each slot by adding up the resources of the tasks it plans to deploy in that slot.

Regarding how to initially set the resources in the job graph, there could be various ways.
- We can expose an interface to let the user decide how many resources each operator needs, like what you can do currently in the DataStream API. But we probably want to change that later for better usability.
- The compiler can set it automatically, according to the operator type and some configured default values for each type.

Anyway, fine-grained resource management is an advanced feature, targeting expert users who know well how many resources their jobs/tasks need. There are also various efforts to make the task-level fine-grained resource configuration automatic, which are not in the scope of this FLIP.

> Second, will the dynamic allocation create the fragments?

Yes, it will. You can also look at FLINK-14106, where we try to make the slot allocation strategy pluggable, so we can have different strategies for different use cases. E.g., we can have a strategy that starts TMs only when slot requests are received, with exactly the resources requested by the slots. That avoids fragments, at the cost of longer scheduling time due to starting TMs late, which should be suitable for long-running streaming jobs. We can also have another strategy that starts a configured number of TMs before receiving any slot request, with predefined resources.
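
To make the sizing and the fragment question a bit more concrete, here is a rough Python sketch. It is not Flink code: `ResourceProfile`, `slot_request`, `allocate_prestarted`, and all the numbers are made up for illustration. It only shows the idea of summing task resources into one slot request, and how a TM started with predefined resources can be left with a fragment that no later request fits into.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceProfile:
    """Hypothetical resource spec: CPU cores and memory in MB (not a Flink class)."""
    cpu: float
    memory_mb: int

    def __add__(self, other):
        return ResourceProfile(self.cpu + other.cpu, self.memory_mb + other.memory_mb)

    def __sub__(self, other):
        return ResourceProfile(self.cpu - other.cpu, self.memory_mb - other.memory_mb)

    def fits_in(self, free):
        """True if this request can be served from the given free resources."""
        return self.cpu <= free.cpu and self.memory_mb <= free.memory_mb


def slot_request(task_profiles):
    """Size one slot by adding up the resources of the tasks planned for it."""
    total = ResourceProfile(0.0, 0)
    for profile in task_profiles:
        total = total + profile
    return total


def allocate_prestarted(tm_free, requests):
    """Serve requests from a TM with predefined resources.

    Returns the served requests and whatever is left over on the TM.
    """
    served = []
    for request in requests:
        if request.fits_in(tm_free):
            served.append(request)
            tm_free = tm_free - request
    return served, tm_free


# Assumed example: a slot that hosts a source task and a map task.
slot = slot_request([ResourceProfile(0.5, 200), ResourceProfile(0.5, 300)])

# A pre-started TM with predefined resources receives three such requests.
served, leftover = allocate_prestarted(ResourceProfile(2.0, 1100), [slot, slot, slot])

# A TM started on demand would be sized to the request itself, so nothing is left over.
on_demand_leftover = slot - slot
```

With these assumed numbers the pre-started TM serves two of the three requests and keeps 100 MB that the third request cannot use, which is the kind of fragment you asked about. Sizing a TM from the request itself leaves nothing unused, but the TM only exists after the request arrives.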
The benefit of pre-starting TMs is that the job gets scheduled immediately, and the cost is potential fragments, so I believe this strategy is more suitable for short batch queries.

Thank you~

Xintong Song

On Tue, Mar 3, 2020 at 3:00 PM shaoxun <838492...@qq.com> wrote:

> Hi Xintong, it is a huge plan to carry on. And I get a few questions about
> the details.
>
> First, does "specific request" for the slots mean the requesting slot
> profile contains detailed information about memory and cpu? And how does a
> job manager determine to ask how much memory? Is it done when
> it scheduled the execution graph? Or maybe I miss something here.
>
> Second, will the dynamic allocation create the fragments? For example, if a
> task executor has 100mb memory left and maybe other tasks all ask for a
> larger memory size.
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/