This sounds good to go ahead from my side. I like the approach Becket suggested - the core abstractions that everyone would need to understand would be "external resource allocation" and the "ResourceInfoProvider", while the GPU-specific code would be one concrete implementation, known only to the component that allocates the external resource. That fits the separation of concerns well.
I also understand that it should not be over-engineered in the first version, so some simplification makes sense, and we can gradually expand from there. So +1 from my side to go ahead with what was suggested above (Xintong / Becket).

On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <tonysong...@gmail.com> wrote:

> Thanks for the comments, Stephan & Becket.
>
> @Stephan
>
> I see your concern, and I completely agree with you that we should first think about the "library" / "plugin" / "extension" style if possible.
>
> > If GPUs are sliced and assigned during scheduling, there may be reason, although it looks that it would belong to the slot then. Is that what we are doing here?
>
> In the current proposal, we do not have the GPUs sliced and assigned to slots, because that could be problematic without dynamic slot allocation. E.g., the number of GPUs might not be evenly divisible by the number of slots.
>
> I think it makes sense to eventually have the GPUs assigned to slots. Even then, we might still need a TM-level GPUManager (or ResourceProvider, like Becket suggested). For memory, each slot can simply request an amount of memory, leaving it to the JVM / OS to decide which memory (address) gets assigned. For GPU, and potentially other resources like FPGA, we need to explicitly specify which GPU (index) should be used. Therefore, we need some component at the TM level to coordinate which slot uses which GPU.
>
> IMO, unless we say Flink will not support slot-level GPU slicing, at least in the foreseeable future, I don't see a good way to avoid touching the TM core. To that end, I think Becket's suggestion points in a good direction: it supports more features (GPU, FPGA, etc.) with less coupling to the TM core, which only needs to understand the general interfaces. The detailed implementation for a specific resource type can even be encapsulated as a library.
>
> @Becket
>
> Thanks for sharing your thoughts on the final state. Regardless of the details of how the interfaces should look, I think this is a really good abstraction for supporting general resource types.
>
> I'd like to further clarify that the following three things are all that the "Flink core" needs to understand:
>
> - The *amount* of resource, for scheduling. Actually, we already have the Resource class in ResourceProfile and ResourceSpec for extended resources. It's just not really used.
> - The *info* that Flink provides to the operators / user code.
> - The *provider*, which generates the info based on the amount.
>
> The "core" does not need to understand the specific implementation details behind these three. They can even be implemented in a 3rd-party library, similar to how we allow users to define their custom MetricReporter.
>
> Thank you~
>
> Xintong Song
>
> On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <becket....@gmail.com> wrote:
>
> Thanks for the comment, Stephan.
>
> > - If everything becomes a "core feature", it will make the project hard to develop in the future. Thinking "library" / "plugin" / "extension" style where possible helps.
>
> Completely agree. It is much more important to design a mechanism than to focus on a specific case. Here is what I am thinking to fully support custom resource management:
>
> 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define the resources and the amounts required. They will be used to find suitable TM slots to run the tasks. At this point, the resources are only measured by amount, i.e. they do not have individual IDs.
>
> 2. On the TM side, have something like a *"ResourceInfoProvider"* to identify and provide the detailed information of the individual resources, e.g. the GPU ID. It is important because the operator may have to explicitly interact with the physical resource it uses. The ResourceInfoProvider might look like something below.
>
>     interface ResourceInfoProvider<INFO> {
>         Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId, ResourceProfile resourceProfile);
>     }
>
> - There could be several "ResourceInfoProvider"s configured on the TM to retrieve the information for different resources.
> - The TM will be responsible for assigning those individual resources to each operator according to its requested amount.
> - The operators will be able to get the ResourceInfo from their RuntimeContext.
>
> If we agree this is a reasonable final state, we can adapt the current FLIP to it. In fact it does not sound like a big change to me. All the proposed configuration can stay as is; it is just that Flink itself won't care about it - instead, a GPUInfoProvider implementing the ResourceInfoProvider will use it.
>
> Thanks,
>
> Jiangjie (Becket) Qin
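To make that concrete: a GPU-specific implementation of the sketched interface could look roughly like the following. This is only an illustration of the separation of concerns - the GPUInfo type, the discovery step and the simple first-fit assignment are assumptions, not part of the FLIP.

    import java.util.*;

    // Illustrative only: the TM core would program against ResourceInfoProvider,
    // while everything GPU-specific stays inside this one class (or a library).
    class GPUInfoProvider implements ResourceInfoProvider<GPUInfo> {

        // GPUs reported by the discovery script and not yet handed out
        private final Deque<GPUInfo> unassigned;

        GPUInfoProvider(Collection<GPUInfo> discoveredGpus) {
            this.unassigned = new ArrayDeque<>(discoveredGpus);
        }

        @Override
        public Map<AbstractID, GPUInfo> retrieveResourceInfo(OperatorId opId, ResourceProfile resourceProfile) {
            // hand out as many free GPUs as the operator's profile asks for (no error handling here)
            Map<AbstractID, GPUInfo> assigned = new HashMap<>();
            for (int i = 0; i < gpuAmountOf(resourceProfile) && !unassigned.isEmpty(); i++) {
                assigned.put(new AbstractID(), unassigned.poll());
            }
            return assigned;
        }

        private int gpuAmountOf(ResourceProfile profile) {
            // placeholder: would read the "gpu" extended resource amount from the profile;
            // the exact accessor is not settled in this discussion
            return 1;
        }
    }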
> On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <se...@apache.org> wrote:
>
> Hi all!
>
> The main point I wanted to throw into the discussion is the following:
>
> - With more and more use cases, more and more tools go into Flink.
>
> - If everything becomes a "core feature", it will make the project hard to develop in the future. Thinking "library" / "plugin" / "extension" style where possible helps.
>
> - A good thought experiment is always: how many future developers have to interact with this code (and possibly understand it partially), even if the features they touch have nothing to do with GPU support? If many contributors to unrelated features will have to touch it and understand it, then let's think about whether there is a different solution. Maybe there is not, but then we should be sure why.
>
> - That led me to raising this issue: if the GPU manager becomes a core service in the TaskManager, Environment, RuntimeContext, etc., then everyone developing the TM and streaming tasks needs to understand the GPU manager. That seems oddly specific, is my impression.
>
> Access to configuration seems not the right reason to do that. We should expose the Flink configuration from the RuntimeContext anyways.
>
> If GPUs are sliced and assigned during scheduling, there may be reason, although it looks that it would belong to the slot then. Is that what we are doing here?
>
> Best,
>
> Stephan
>
> On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <tonysong...@gmail.com> wrote:
>
> Thanks for the feedback, Becket.
>
> IMO, eventually an operator should only see the info of GPUs that are dedicated to it, instead of all GPUs on the machine/container as in the current design. It does not make sense to let the user who writes a UDF worry about coordination among multiple operators running on the same machine.
> And if we want to limit the GPU info an operator sees, we should not let the operator instantiate the GPUManager, which means we have to expose something through the runtime context - either the GPU info or some kind of limited access to the GPUManager.
>
> Thank you~
>
> Xintong Song
>
> On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <becket....@gmail.com> wrote:
>
> It probably makes sense for us to first agree on the final state. More specifically, will the resource info be exposed through the runtime context eventually?
>
> If that is the final state and we have a seamless migration story from this FLIP to that final state, personally I think it is OK to expose the GPU info in the runtime context.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <tonysong...@gmail.com> wrote:
>
> @Yangze,
>
> I think what Stephan means (@Stephan, please correct me if I'm wrong) is that we might not need to hold and maintain the GPUManager as a service in TaskManagerServices or RuntimeContext. An alternative is to create / retrieve the GPUManager only in the operators that need it, e.g., with a static method `GPUManager.get()`.
>
> @Stephan,
>
> I agree with you on excluding the GPUManager from TaskManagerServices.
>
> - For the first step, where we provide unified TM-level GPU information to all operators, it should be fine to have operators access / lazily initiate the GPUManager by themselves.
> - In the future, we might have some more fine-grained GPU management, where we need to maintain the GPUManager as a service and put GPU info in slot profiles. But at least for now it's not necessary to introduce such complexity.
>
> However, I have some concerns about excluding the GPUManager from RuntimeContext and letting operators access it directly:
>
> - The configuration needed for creating the GPUManager is not always available to operators.
> - If later we want to have fine-grained control over GPUs (e.g., operators in each slot can only see GPUs reserved for that slot), the approach cannot be easily extended.
>
> I would suggest wrapping the GPUManager behind RuntimeContext and only exposing the GPUInfo to users. For now, we can declare a method `getGPUInfo()` in RuntimeContext, with a default definition that calls `GPUManager.get()` to get the lazily-created GPUManager. If later we want to create / retrieve the GPUManager in a different way, we can simply change how `getGPUInfo` is implemented, without needing to change any public interfaces.
>
> Thank you~
>
> Xintong Song
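A minimal sketch of that suggestion, with the caveat that the method name, the return type and the delegation target are all still up for discussion:

    import java.util.Set;

    public interface RuntimeContext {

        // ... existing RuntimeContext methods elided ...

        /**
         * Default implementation delegates to the lazily created, TM-level GPUManager.
         * If GPU assignment later becomes slot- or operator-scoped, only this default
         * (or its overrides) has to change; the public signature stays the same.
         */
        default Set<GPUInfo> getGPUInfo() {
            return GPUManager.get().getGPUInfo();
        }
    }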
> On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <karma...@gmail.com> wrote:
>
> @Stephan
>
> Do you mean MiniCluster? Yes, it makes sense to share the GPUManager in such a scenario. If that's what you worry about, I'm +1 for holding the GPUManager (ExternalResourceManagers) in the TaskExecutor instead of TaskManagerServices.
>
> Regarding the RuntimeContext/FunctionContext, it just holds the GPU info instead of the GPUManager. AFAIK, it's the only place we could pass GPU info to a RichFunction/UserDefinedFunction.
>
> Best,
>
> Yangze Guo
>
> On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <is...@paddlesoft.net> wrote:
>
> ---- On Fri, 13 Mar 2020 15:58:20 +0000 se...@apache.org wrote ----
>
> > > Can we somehow keep this out of the TaskManager services?
> >
> > I fear that we could not. IMO, the GPUManager (or ExternalResourceManagers in the future) is conceptually one of the task manager services, just like the MemoryManager before 1.10.
> > - It maintains/holds the GPU resources at TM level and all of the operators allocate GPU resources from it. So, it should be exclusive to a single TaskExecutor.
> > - We could add a collection called ExternalResourceManagers to hold all managers of other external resources in the future.
>
> Can you help me understand why this needs the addition in TaskManagerServices or in the RuntimeContext?
>
> Are you worried about the case when multiple Task Executors run in the same JVM? That's not common, but wouldn't it actually be good in that case to share the GPU Manager, given that the GPU is shared?
>
> Thanks,
>
> Stephan
>
> ---------------------------
>
> > > What parts need information about this?
> >
> > In this FLIP, operators need the information. Thus, we expose GPU information to the RuntimeContext/FunctionContext. The slot profile is not aware of GPU resources, as GPU is a TM-level resource for now.
> >
> > > Can the GPU Manager be a "self contained" thing that simply takes the configuration, and then abstracts everything internally?
> >
> > Yes, we just pass the path/args of the discovery script and how many GPUs per TM to it. It takes the responsibility of getting the GPU information and exposing it to the RuntimeContext/FunctionContext of operators. Meanwhile, we'd better not allow operators to directly access the GPUManager; they should get what they need from the Context. We could then decouple the interface/implementation of the GPUManager from the public API.
> > Best,
> >
> > Yangze Guo
>
> On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <se...@apache.org> wrote:
>
> It sounds fine to initially start with GPU-specific support and think about generalizing this once we better understand the space.
>
> About the implementation suggested in FLIP-108:
>
> - Can we somehow keep this out of the TaskManager services? Anything we have to pull through all layers of the TM makes the TM components yet more complex and harder to maintain.
>
> - What parts need information about this?
>   -> Do the slot profiles need information about the GPU?
>   -> Can the GPU Manager be a "self contained" thing that simply takes the configuration, and then abstracts everything internally? Operators can access it via "GPUManager.get()" or so?
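Spelling out that last question as code: a fully self-contained manager might be little more than a lazily initialized singleton that runs the discovery script once. The sketch below is illustrative only - in particular, reading the settings via system properties is just a stand-in, since how the manager gets at the TaskManager configuration is exactly the open question in this thread.

    import java.util.Collections;
    import java.util.Set;

    public final class GPUManager {

        private static volatile GPUManager instance;

        private final Set<GPUInfo> gpuInfo;

        private GPUManager(String discoveryScriptPath, String discoveryScriptArgs, int gpusPerTaskManager) {
            // run the discovery script once and cache whatever GPU indexes it reports
            this.gpuInfo = runDiscoveryScript(discoveryScriptPath, discoveryScriptArgs, gpusPerTaskManager);
        }

        /** Lazily initialized on first access, e.g. from an operator via GPUManager.get(). */
        public static GPUManager get() {
            if (instance == null) {
                synchronized (GPUManager.class) {
                    if (instance == null) {
                        // stand-in for "somehow obtain the TM configuration"
                        instance = new GPUManager(
                                System.getProperty("taskmanager.resource.gpu.discovery-script.path"),
                                System.getProperty("taskmanager.resource.gpu.discovery-script.args", ""),
                                Integer.getInteger("taskmanager.resource.gpu.amount", 0));
                    }
                }
            }
            return instance;
        }

        public Set<GPUInfo> getGPUInfo() {
            return gpuInfo;
        }

        private static Set<GPUInfo> runDiscoveryScript(String path, String args, int amount) {
            // actual script execution elided in this sketch
            return Collections.emptySet();
        }
    }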
> On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <karma...@gmail.com> wrote:
>
> Thanks for all the feedback.
>
> @Becket
> Regarding the WebUI and GPUInfo, you're right, I'll add them to the Public API section.
>
> @Stephan @Becket
> Regarding the general extended resource mechanism, I second Xintong's suggestion.
>
> - It's better to leverage ResourceProfile and ResourceSpec after we support fine-grained GPU scheduling. As a first-step proposal, I prefer not to include it in the scope of this FLIP.
> - Regarding the "Extended Resource Manager", if I understand correctly, it is just a code refactoring atm; we could extract the open/close/allocateExtendResources of the GPUManager into that interface. If that is the case, +1 to do it during implementation.
>
> @Xingbo
> As Xintong said, we looked into how Spark supports a general "Custom Resource Scheduling" before and decided to introduce a common resource configuration schema (taskmanager.resource.{resourceName}.amount/discovery-script) to make it more extensible. I think "resource" is a proper level to contain all the configs of extended resources.
>
> Best,
>
> Yangze Guo
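For a GPU-only setup, that schema would make the TaskManager configuration look roughly like the example below. Only the ".amount" and ".discovery-script.path" keys are named in this thread; the ".args" key and the values are illustrative (the privileged mode is the one mentioned in the original proposal).

    taskmanager.resource.gpu.amount: 2
    taskmanager.resource.gpu.discovery-script.path: /opt/flink/bin/gpu-discovery.sh
    taskmanager.resource.gpu.discovery-script.args: --privileged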
> On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <hxbks...@gmail.com> wrote:
>
> Thanks a lot for the FLIP, Yangze.
>
> There is no doubt that GPU resource management support will greatly facilitate the development of AI-related applications by PyFlink users.
>
> I have only one comment about this wiki:
>
> Regarding the names of several GPU configurations, I think it is better to delete the "resource" field to make them consistent with the names of other resource-related configurations in TaskManagerOptions, e.g. taskmanager.resource.gpu.discovery-script.path -> taskmanager.gpu.discovery-script.path
>
> Best,
>
> Xingbo
>
> On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <tonysong...@gmail.com> wrote:
>
> @Stephan, @Becket,
>
> Actually, Yangze, Yang and I also had an offline discussion about making the "GPU Support" some general "Extended Resource Support". We believe supporting extended resources through a general mechanism is definitely a good and extensible way. The reason we propose this FLIP narrowing its scope down to GPU alone is mainly the concern about the extra effort and review capacity needed for a general mechanism.
>
> To come up with a good design for a general extended resource management mechanism, we would need to investigate more how people use different kinds of resources in practice. For GPU, we learnt such knowledge from the experts, Becket and his team members. But for FPGA, or other potential extended resources, we don't have such convenient information sources, making the investigation require more effort, which I tend to think is not necessary atm.
>
> On the other hand, we also looked into how Spark supports a general "Custom Resource Scheduling". Assuming we want to have a similar general extended resource mechanism in the future, we believe that the current GPU support design can be easily extended, in an incremental way without too much rework.
>
> - The most important part is probably the user interfaces. Spark offers configuration options to define the amount, discovery script and vendor (on K8s) on a per-resource-type basis [1], which is very similar to what we proposed in this FLIP. I think it's not necessary to expose config options in the general way atm, since we do not have support for other resource types now. If we later decide to have per-resource-type config options, we can keep backwards compatibility with the currently proposed options through simple key mapping.
> - For the GPU Manager, if later needed we can change it to an "Extended Resource Manager" (or whatever it is called). That should be a pure component-internal refactoring.
> - For ResourceProfile and ResourceSpec, there are already fields for general extended resources. We can of course leverage them when supporting fine-grained GPU scheduling. That is also not in the scope of this first-step proposal, and would require FLIP-56 to be finished first.
>
> To sum up, I agree with Becket to have a separate FLIP for the general extended resource mechanism, and to keep it in mind when discussing and implementing the current one.
>
> Thank you~
>
> Xintong Song
>
> [1] https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <becket....@gmail.com> wrote:
>
> That's a good point, Stephan. It makes total sense to generalize the resource management to support custom resources. Having that allows users to add new resources by themselves. The general resource management may involve two different aspects:
>
> 1. The custom resource type definition. It is supported by the extended resources in ResourceProfile and ResourceSpec. This will likely cover the majority of the cases.
>
> 2. The custom resource allocation logic, i.e. how to assign the resources to different tasks, operators, and so on. This may require two levels / steps:
>    a. Subtask level - make sure the subtasks are put into suitable slots. It is done by the global RM and is not customizable right now.
>    b. Operator level - map the exact resource to the operators in the TM, e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed assuming the global RM does not distinguish individual resources of the same type. That is true for memory, but not for GPU.
>
> The GPU manager is designed to do 2.b here. So it should discover the physical GPU information and bind/match it to each operator. Making this general will fill in the missing piece to support custom resource type definition. But I'd avoid calling it an "External Resource Manager" to avoid confusion with the RM; maybe something like "Operator Resource Assigner" would be more accurate. So for each resource type, users can have an optional "Operator Resource Assigner" in the TM. For memory, users don't need this, but for other extended resources, users may need it.
>
> Personally I think a pluggable "Operator Resource Assigner" is achievable in this FLIP. But I am also OK with having that in a separate FLIP, because the interface between the "Operator Resource Assigner" and the operator may take a while to settle down if we want to make it generic. But I think our implementation should take this future work into consideration so that we don't need to break backwards compatibility once we have it.
>
> Thanks,
>
> Jiangjie (Becket) Qin
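As a rough sketch of what step 2.b could look like behind a pluggable interface - the name comes from the mail above, while the generic INFO parameter and the exact method shapes are assumptions:

    import java.util.Set;

    /**
     * One assigner could be registered per resource type on the TM; optional, since a
     * resource like memory does not need one, but GPU / FPGA would, because the TM has
     * to decide which concrete device each operator gets.
     */
    interface OperatorResourceAssigner<INFO> {

        /** Reserve `amount` individual resources for the operator, e.g. GPU 1 for A, GPU 2 for B. */
        Set<INFO> assign(OperatorId operatorId, int amount);

        /** Return the operator's resources to the pool once its task is done. */
        void release(OperatorId operatorId);
    }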
> On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <se...@apache.org> wrote:
>
> Thank you for writing this FLIP.
>
> I cannot really give much input into the mechanics of GPU-aware scheduling and GPU allocation, as I have no experience with that.
>
> One thought I had when reading the proposal is whether it makes sense to look at the "GPU Manager" as an "External Resource Manager", with GPU being one such resource. The way I understand ResourceProfile and ResourceSpec, that is how it is done there. It has the advantage that it looks more extensible. Maybe there is a GPU Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an Alibaba TPU Resource, etc.
>
> Best,
>
> Stephan
>
> On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <becket....@gmail.com> wrote:
>
> Thanks for the FLIP, Yangze. GPU resource management support is a must-have for machine learning use cases. Actually, it is one of the most frequently asked questions from users who are interested in using Flink for ML.
>
> Some quick comments / questions on the wiki:
>
> 1. The WebUI / REST API should probably also be mentioned in the public interface section.
> 2. Is the data structure that holds the GPU info also a public API?
>
> Thanks,
>
> Jiangjie (Becket) Qin
> On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <tonysong...@gmail.com> wrote:
>
> Thanks for drafting the FLIP and kicking off the discussion, Yangze.
>
> Big +1 for this feature. Supporting the use of GPUs in Flink is significant, especially for ML scenarios. I've reviewed the FLIP wiki doc and it looks good to me. I think it's a very good first step for Flink's GPU support.
>
> Thank you~
>
> Xintong Song
> On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <karma...@gmail.com> wrote:
>
> Hi everyone,
>
> We would like to start a discussion thread on "FLIP-108: Add GPU support in Flink" [1].
>
> This FLIP mainly discusses the following issues:
>
> - Enable users to configure how many GPUs a task executor has and forward such requirements to the external resource managers (for Kubernetes/Yarn/Mesos setups).
> - Provide information about the available GPU resources to operators.
>
> Key changes proposed in the FLIP are as follows:
>
> - Forward GPU resource requirements to Yarn/Kubernetes.
> - Introduce the GPUManager as one of the task manager services to discover and expose GPU resource information to the context of functions.
> - Introduce a default script for GPU discovery, in which we provide a privileged mode to help users achieve worker-level isolation in standalone mode.
>
> Please find more details in the FLIP wiki document [1]. Looking forward to your feedback.
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>
> Best,
>
> Yangze Guo
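For completeness, the user-facing side of "provide information of available GPU resources to operators" might end up looking roughly like the snippet below in a rich function. The getGPUInfo() accessor is the one discussed earlier in this thread, not a finalized API, and the class itself is purely illustrative.

    import java.util.Set;

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    public class GpuAwareMapper extends RichMapFunction<float[], float[]> {

        private transient Set<GPUInfo> gpus;

        @Override
        public void open(Configuration parameters) {
            // hypothetical accessor, see the getGPUInfo() discussion above
            gpus = getRuntimeContext().getGPUInfo();
        }

        @Override
        public float[] map(float[] features) {
            // pick one of the exposed GPUs and run the actual device work there
            // (e.g. through a native binding) - elided in this sketch
            return features;
        }
    }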