Hi Xintong,

Thanks for proposing this FLIP. The general design looks good to me; +1 for this feature.

Since slots in the same task executor can have different resource profiles, we will run into a resource fragmentation problem. Consider this case:
- Request A wants 1G of memory, while requests B and C each want 0.5G.
- There are two task executors, T1 and T2, with 1G and 0.5G of free memory respectively.

If B comes first and we cut a slot from T1 for B, A must wait until other tasks free up resources. But A could have been scheduled immediately if we had cut the slot for B from T2.

The logic of findMatchingSlot now becomes finding a task executor that has enough resources and then cutting a slot from it. The current method can be seen as a "first-fit" strategy, which works well in general but is not always optimal.

Actually, this problem can be abstracted as the "Bin Packing Problem" [1]. Here are some common approximation algorithms:
- First fit
- Next fit
- Best fit

But it becomes a multi-dimensional bin packing problem once we take CPU into account, and it is then hard to define which executor is the "best fit". Some research has addressed this problem, such as Tetris [2].

Here are some thoughts on it:
1. We could make the strategy for finding a matching task executor pluggable, and let users configure the strategy that works best in their scenario.
2. We could support a batch request interface in the RM, because we have more opportunities to optimize when we have more information. If we know about A, B, and C at the same time, we could always make the best decision.

[1] http://www.or.deis.unibo.it/kp/Chapter8.pdf
[2] https://www.cs.cmu.edu/~xia/resources/Documents/grandl_sigcomm14.pdf

Best,
Yangze Guo

On Thu, Aug 15, 2019 at 10:40 PM Xintong Song <tonysong...@gmail.com> wrote:
>
> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-53: Fine Grained
> Resource Management"[1], where we propose how to improve Flink resource
> management and scheduling.
>
> This FLIP mainly discusses the following issues.
>
> - How to support tasks with fine grained resource requirements.
> - How to unify resource management for jobs with / without fine grained
>   resource requirements.
> - How to unify resource management for streaming / batch jobs.
>
> Key changes proposed in the FLIP are as follows.
>
> - Unify memory management for operators with / without fine grained
>   resource requirements by applying a fraction based quota mechanism.
> - Unify resource scheduling for streaming and batch jobs by setting slot
>   sharing groups for pipelined regions during compiling stage.
> - Dynamically allocate slots from task executors' available resources.
>
> Please find more details in the FLIP wiki document [1]. Looking forward to
> your feedbacks.
>
> Thank you~
>
> Xintong Song
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Resource+Management
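
P.S. To make the fragmentation example from my reply concrete, here is a minimal Python sketch that replays the A/B/C scenario with a first-fit and a best-fit strategy. It is only an illustration under assumptions: the function names (first_fit, best_fit, schedule) are hypothetical and are not Flink APIs, memory is the only dimension considered, and I treat 1G as 1024 MB.

```python
# Illustrative sketch only: these names are hypothetical, not Flink APIs.
# Memory is the single resource dimension; sizes are in MB (1G assumed = 1024 MB).
from typing import Callable, Dict, List, Optional, Tuple

Strategy = Callable[[Dict[str, int], int], Optional[str]]


def first_fit(free: Dict[str, int], need: int) -> Optional[str]:
    """Return the first executor (in insertion order) with enough free memory."""
    for name, avail in free.items():
        if avail >= need:
            return name
    return None


def best_fit(free: Dict[str, int], need: int) -> Optional[str]:
    """Return the executor that would have the least memory left over."""
    candidates = [(avail - need, name) for name, avail in free.items() if avail >= need]
    return min(candidates)[1] if candidates else None


def schedule(
    requests: List[Tuple[str, int]], executors: Dict[str, int], pick: Strategy
) -> Dict[str, Optional[str]]:
    """Place requests one by one in arrival order; None means the request must wait."""
    free = dict(executors)  # copy so the caller's view is not mutated
    placement: Dict[str, Optional[str]] = {}
    for request, need in requests:
        target = pick(free, need)
        if target is not None:
            free[target] -= need  # "cut a slot" from the chosen executor
        placement[request] = target
    return placement


executors = {"T1": 1024, "T2": 512}
requests = [("B", 512), ("A", 1024), ("C", 512)]  # B arrives first

first = schedule(requests, executors, first_fit)
best = schedule(requests, executors, best_fit)
print(first)
print(best)
```

With first fit, B is cut from T1, so A has to wait even though 1G is free in total across T1 and T2. With best fit, B goes to T2 and A fits into T1 immediately; C then waits, which is unavoidable because the three requests need 2G and only 1.5G is free. Once CPU is added as a second dimension, "least left over" is no longer a single number, which is why a pluggable strategy seems useful.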