Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-03 Thread Kezhu Wang
Hi Xintong, Thanks for the background! I understand the impracticality of operator-level specifications and the value of group-level specifications. I am just not that confident about “Coupling between operator chaining / slot sharing”; it seems to me that it requires more knowledge than “Expose operator chain

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-03 Thread Xintong Song
Hi Kezhu, Let me share some background first. - We at Alibaba have been using fine-grained resource management for many years, with Blink (an internal version of Flink). - We have been trying to contribute this feature to Apache Flink for many years. However, we haven't s

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-03 Thread Kezhu Wang
Hi Till, Based on what I understood, if I am not wrong, the door is not closed after specifying SSG resources. So I hope it could be useful for potential future improvements. Best, Kezhu Wang On February 3, 2021 at 18:07:21, Till Rohrmann (trohrm...@apache.org) wrote: Thanks for sharing your thoughts K

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-03 Thread Kezhu Wang
Hi Yangze and Xintong, thank you for your replies. I did indeed make assumptions; I list them here in order: 1. There is only a task/LogicalSlot-level resource specification in the runtime. It comes from the API side and is respected in the runtime. 2. Current operator level resource specification in client side is

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-03 Thread Till Rohrmann
Thanks for sharing your thoughts Kezhu. I like your ideas of how per-operator and SSG requirements can be combined. I've also thought about defining a default resource profile for all tasks which have no resources configured. That way all operators would have resources assigned if the user chooses

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-02 Thread Xintong Song
Thanks for your feedback, Kezhu. > I think Flink *runtime* already has an ideal granularity for resource management 'task'. If there is a slot shared by multiple tasks, that slot's resource requirement is simple sum of all its logical slots. So basically, this is no resource requirement for
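
The quoted claim above, that a shared slot's requirement is the simple sum of the requirements of its logical slots, can be illustrated with a minimal sketch. `ResourceSpec`, its two fields, and `shared_slot_requirement` are hypothetical names made up for this illustration; they are not Flink's actual classes or the FLIP's proposed API.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class ResourceSpec:
    """Hypothetical resource requirement (an assumption, not a Flink class)."""
    cpu_cores: float = 0.0
    memory_mb: int = 0

    def merge(self, other: "ResourceSpec") -> "ResourceSpec":
        # Component-wise sum of two requirements.
        return ResourceSpec(
            self.cpu_cores + other.cpu_cores,
            self.memory_mb + other.memory_mb,
        )


def shared_slot_requirement(task_specs):
    """Sum the requirements of all tasks (logical slots) sharing one slot."""
    return reduce(ResourceSpec.merge, task_specs, ResourceSpec())


# Three tasks that share a single slot.
tasks = [ResourceSpec(0.5, 256), ResourceSpec(1.0, 512), ResourceSpec(0.25, 128)]
total = shared_slot_requirement(tasks)
print(total)
```

Under this reading the slot itself carries no independent requirement; it is fully determined by the tasks placed in it.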

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-02 Thread Yangze Guo
Hi, Kezhu. Thanks for your feedback. > Flink *runtime* already has an ideal granularity for resource management 'task'. As mentioned in the FLIP, there is some ancient code in the Flink code base, but this code was never really used or exposed to users. So there is actually no operator or SSG lev

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-02-02 Thread Kezhu Wang
Hi all, sorry for joining the discussion after voting has started. I want to share my thoughts on this after reading the discussion above. I think the Flink *runtime* already has an ideal granularity for resource management: the 'task'. If there is a slot shared by multiple tasks, that slot's resource requirement

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-31 Thread Yangze Guo
Thanks for the replies, Till and Xintong! I have updated the FLIP, including: - Edited the JavaDoc of the proposed StreamGraphGenerator#setSlotSharingGroupResource. - Added a "Future Plan" section, which contains the potential follow-up issues and the limitations to be documented when fine-grained resource manageme

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-29 Thread Till Rohrmann
Thanks for summarizing the discussion, Yangze. I agree that setting resource requirements per operator is not very user friendly. Moreover, I couldn't come up with a different proposal which would be as easy to use and wouldn't expose internal scheduling details. In fact, following this argument th

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-25 Thread Xintong Song
Thanks for the summary, Yangze. The changes and follow-up issues LGTM. Let's wait for responses from the others before starting a vote. Thank you~ Xintong Song On Tue, Jan 26, 2021 at 11:08 AM Yangze Guo wrote: > Thanks everyone for the lively discussion. I'd like to try to > summarize the

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-25 Thread Yangze Guo
Thanks everyone for the lively discussion. I'd like to try to summarize the current convergence in the discussion. Please let me know if I got things wrong or missed something crucial here. Changes to this FLIP: - Treat the SSG resource requirements as a hint instead of a restriction for the runtim

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-21 Thread Xintong Song
FGRuntimeInterface.png Thank you~ Xintong Song On Fri, Jan 22, 2021 at 11:11 AM Xintong Song wrote: > I think Chesnay's proposal could actually work. IIUC, the key point is to derive operator requirement

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-21 Thread Xintong Song
I think Chesnay's proposal could actually work. IIUC, the key point is to derive operator requirements from SSG requirements on the API side, so that the runtime only deals with operator requirements. It's debatable how the derivation should be done, though. E.g., an alternative could be to evenly divi
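
The snippet above is cut off at "evenly divi", so the following is only a sketch under the assumption that the alternative being described is an even split of the SSG requirement across the operators in the group. The function name `split_evenly` and its parameters are hypothetical and do not come from the FLIP or from Flink.

```python
from fractions import Fraction


def split_evenly(ssg_cpu_cores, ssg_memory_mb, operator_names):
    """Derive per-operator requirements by dividing an SSG requirement evenly.

    Assumption: every operator in the group gets the same share. Fractions
    are used so that the shares always add back up to the SSG total exactly.
    """
    if not operator_names:
        raise ValueError("a slot sharing group needs at least one operator")
    n = len(operator_names)
    share = (Fraction(ssg_cpu_cores) / n, Fraction(ssg_memory_mb) / n)
    return {name: share for name in operator_names}


# Hypothetical group of four operators sharing 2 CPU cores and 1024 MB.
per_operator = split_evenly(2, 1024, ["source", "map", "window", "sink"])
for name, (cpu, mem) in per_operator.items():
    print(f"{name}: cpu={float(cpu)} memory_mb={float(mem)}")
```

With a derivation like this on the API side, the runtime would only ever see operator-level requirements, which is the point the message makes; whether an even split is the right derivation is exactly what the message calls debatable.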

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-21 Thread Chesnay Schepler
You're raising a good point, but I think I can rectify that with a minor adjustment. Default requirements are whatever the default requirements are, setting the requirements for one operator has no effect on other operators. With these rules, and some API enhancements, the following mockup wo

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-21 Thread Xintong Song
I second Till's concern about implicitly interpreting zero resource requirements for unspecified operators. I'm not against allowing both specifying SSG requirements as shortcuts and further refining operator requirements as needed. Combining Till's idea, we can do the following. - Prefer using o

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-21 Thread Till Rohrmann
If I understand you correctly, Chesnay, then you want to decouple the resource requirement specification from the slot sharing group assignment. Hence, by default all operators would be in the same slot sharing group. If there is no operator with a resource specification, then the system would allo

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-20 Thread Chesnay Schepler
Is there even a functional difference between specifying the requirements for an SSG vs specifying the same requirements on a single operator within that group (ideally a colocation group to avoid this whole hint business)? Wouldn't we get the best of both worlds in the latter case? Users can

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-20 Thread Yangze Guo
@Till Also +1 to treating the SSG resource requirements as a hint instead of a restriction. We can treat it as a follow-up effort and make it clear in the JavaDocs as a first step. Best, Yangze Guo On Thu, Jan 21, 2021 at 10:00 AM Xintong Song wrote: > > I think this makes sense. > > The semantic of a

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-20 Thread Xintong Song
I think this makes sense. The semantics of an SSG are that operators in the group *can* be scheduled together in a slot, which is not a *must*. Specifying resources for SSGs should not change those semantics. In cases where the need to schedule the operators into different slots arises, it makes sense fo

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-20 Thread Till Rohrmann
Maybe a different minor idea: Would it be possible to treat the SSG resource requirements as a hint for the runtime similar to how slot sharing groups are designed at the moment? Meaning that we don't give the guarantee that Flink will always deploy this set of tasks together no matter what comes.

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Yangze Guo
Thanks for the responses, Till and Xintong. I second Xintong's comment that the SSG-based runtime interface will give us the flexibility to achieve an op/task-based approach. That's one of the most important reasons for our design choice. My two cents regarding the default operator resource: - It might be

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Xintong Song
Thanks for the feedback, Till. ## I feel that what you proposed (operator-based + default value) might be subsumed by the SSG-based approach. Thinking of op_1 -> op_2, there are the following 4 cases, categorized by whether the resource requirements are known to the users. 1. *Both known.* As

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Till Rohrmann
Thanks for the responses, Xintong and Stephan. I agree that being able to define the resource requirements for a group of operators is more user friendly. However, my concern is that we are thereby exposing internal runtime strategies which might limit our flexibility to execute a given job. Moreov

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-19 Thread Xintong Song
Thanks for the feedback, Stephan. Actually, your proposal also came to my mind at some point, and I have some concerns about it. 1. It does not give users the same control as the SSG-based approach. While neither approach requires specifying requirements for each operator, the SSG-based approach suppo

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-18 Thread Stephan Ewen
Thanks a lot, Yangze and Xintong, for this FLIP. I want to say, first of all, that this is super well written. And the points that the FLIP makes about how to expose the configuration to users are exactly the right thing to figure out first. So good job here! About how to let users specify the reso

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-07 Thread Xintong Song
Thanks for drafting the FLIP and driving the discussion, Yangze. And thanks for the feedback, Till and Chesnay. @Till, I agree that specifying requirements for SSGs means that SSGs need to be supported in fine-grained resource management, otherwise each operator might use as many resources as the

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-07 Thread Yangze Guo
Thanks for your feedback. @Till > the only option for a scheduler which does not support slot sharing groups is to say that every operator in this slot sharing group needs a slot with the same resources as the whole group. At the moment, all the implementations of the scheduler respect the s

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-07 Thread Chesnay Schepler
Will declaring them on slot sharing groups not also waste resources if the parallelism of operators within that group differs? It also seems like quite a hassle for users to have to recalculate the resource requirements if they change the slot sharing. I'd think that it's not really workab

Re: [DISCUSS] FLIP-156: Runtime Interfaces for Fine-Grained Resource Requirements

2021-01-07 Thread Till Rohrmann
Thanks for drafting this FLIP and starting this discussion Yangze. I like that defining resource requirements on a slot sharing group makes the overall setup easier and improves usability of resource requirements. What I do not like about it is that it changes slot sharing groups from being a sch