Hi all,

Thanks for all your feedback. All of it is helpful and valuable to us.
If there are no further comments on FLIP-250, we plan to start a VOTE thread next Monday.

Thank you all!

BR,
Bo Zhao

bo zhaobo <bzhaojyathousa...@gmail.com> wrote on Fri, Jul 15, 2022 at 10:02:

> Thanks all, @Yang Wang and @Yikun Jiang.
>
> Hi Martijn,
>
> We understand your concern. Do the emails above clear up your doubts?
>
> "
> Thanks for the info! I think I see that you've already updated the FLIP to
> reflect how customized schedulers are beneficial for both batch and
> streaming jobs.
> "
>
> >>>
>
> Yes, it's true that the "Motivation" paragraph confused readers, so
> I updated the FLIP description. Thanks for your feedback and correction.
>
> "
> The reason why I'm not too happy that we would only create a reference
> implementation for Volcano is that we don't know if the generic support for
> customized scheduler plugins will also work for others. We think it will,
> but since there would be no other implementation available, we are not
> sure. My concern is that when someone tries to add support for another
> scheduler, we notice that we actually made a mistake or should improve the
> generic support.
> "
>
> >>>
>
> Yes, I understand your concern. Does Yikun Jiang's description and
> shared experience help address it? Or do we need to work out the common
> parts of some popular customized K8S schedulers and refresh the doc?
> Waiting for your advice. ;-)
>
> Best regards,
>
> Bo Zhao
>
> Yikun Jiang <yikunk...@gmail.com> wrote on Thu, Jul 14, 2022 at 18:45:
>
>> > And maybe we also could ping Yikun Jiang who has done similar things in
>> > Spark.
>>
>> Thanks @wangyang for the ping. Yes, I was involved in Spark's customized
>> scheduler support work as the main contributor.
>>
>> For customized scheduler support, I can share the schedulers' requirements
>> here:
>>
>> 1. Help the scheduler to *specify* the scheduler name
>>
>> 2. 
Help the scheduler to create the *scheduler-related label/annotation/CRD*,
>> such as:
>>    - YuniKorn needs labels/annotations
>>      <https://yunikorn.apache.org/docs/user_guide/labels_and_annotations_in_yunikorn/>
>>      (maybe a task group CRD in the future, or not)
>>    - Volcano needs annotations and a CRD <https://volcano.sh/en/docs/podgroup/>
>>    - Kube-batch needs annotations/CRD
>>      <https://github.com/kubernetes-sigs/kube-batch/tree/master/config/crds>
>>    - Kueue needs annotation support
>>      <https://github.com/kubernetes-sigs/kueue/blob/888cedb6e62c315e008916086308a893cd21dd66/config/samples/sample-job.yaml#L6>
>>      and a cluster-level CRD
>>
>> 3. Help the scheduler to create the scheduler meta/CRD at the *right time*.
>> For example, if users want to avoid pod max pending, we need to create the
>> scheduler-required CRD before pod creation.
>>
>> For complex requirements, Spark uses featurestep to support them (Flink
>> decorators look very similar to it).
>> For simple requirements, users can just use configuration or a Pod Template.
>> [1]
>> https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes
>>
>> From the FLIP, I can see that the above requirements are covered.
>>
>> BTW, I think the Flink decorators' existing and newly added interfaces
>> already cover all requirements of Kubernetes, so I personally think the
>> K8s-related scheduler requirements can also be well covered by them.
>>
>> Regards,
>> Yikun
>>
>>
>> On Thu, Jul 14, 2022 at 5:11 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>> > I think we could go over the customized scheduler plugin mechanism again
>> > with YuniKorn to make sure that it is common enough.
>> > But the implementation could be deferred.
>> >
>> > And maybe we also could ping Yikun Jiang who has done similar things in
>> > Spark.
>> >
>> > For the e2e tests, I admit that they could be improved. 
But I am not sure
>> > whether we really need the Java implementation instead.
>> > This is out of the scope of this FLIP, so let's keep the discussion
>> > under FLINK-20392.
>> >
>> >
>> > Best,
>> > Yang
>> >
>> > Martijn Visser <martijnvis...@apache.org> wrote on Thu, Jul 14, 2022 at 15:28:
>> >
>> > > Hi Bo,
>> > >
>> > > Thanks for the info! I think I see that you've already updated the FLIP to
>> > > reflect how customized schedulers are beneficial for both batch and
>> > > streaming jobs.
>> > >
>> > > The reason why I'm not too happy that we would only create a reference
>> > > implementation for Volcano is that we don't know if the generic support for
>> > > customized scheduler plugins will also work for others. We think it will,
>> > > but since there would be no other implementation available, we are not
>> > > sure. My concern is that when someone tries to add support for another
>> > > scheduler, we notice that we actually made a mistake or should improve the
>> > > generic support.
>> > >
>> > > Best regards,
>> > >
>> > > Martijn
>> > >
>> > >
>> > >
>> > > On Thu, Jul 14, 2022 at 05:30, bo zhaobo <bzhaojyathousa...@gmail.com> wrote:
>> > >
>> > > > Hi Martijn,
>> > > >
>> > > > Thank you for your comments. I will answer the questions one by one.
>> > > >
>> > > > ""
>> > > > * Regarding the motivation, it mentions that the development trend is that
>> > > > Flink supports both batch and stream processing. I think the vision and
>> > > > trend is that we have unified batch- and stream processing. What I'm
>> > > > missing is the vision on what's the impact for customized Kubernetes
>> > > > schedulers on stream processing. Could there be some elaboration on that?
>> > > > ""
>> > > >
>> > > > >>>
>> > > >
>> > > > We very much agree with you and the dev trend that Flink supports both
>> > > > batch and stream processing. 
Actually, using a customized K8S scheduler
>> > > > is beneficial for streaming scenarios too, for example to avoid resource
>> > > > deadlock and other problems. Suppose the remaining resources in the
>> > > > K8S cluster are only enough to run one job, but we submit two.
>> > > > With the default K8S scheduler, both jobs request resources at the
>> > > > same time, block each other, and hang. In this case,
>> > > > the customized scheduler Volcano won't schedule overcommitted pods if the
>> > > > idle resources cannot fit all of the pods that would follow. So the
>> > > > benefits mentioned in the FLIP are not only for batch jobs. In fact, the
>> > > > 4 scheduling capabilities mentioned in the FLIP are all required for
>> > > > stream processing. YARN has some of those scheduling features too, such
>> > > > as priority scheduling, min/max resource constraints, etc.
>> > > >
>> > > > ""
>> > > > * While the FLIP talks about customized schedulers, it focuses on Volcano.
>> > > > Why is the choice made to only focus on Volcano and not on other schedulers
>> > > > like Apache YuniKorn? Can we not also provide an implementation for
>> > > > YuniKorn at the same time? Spark did the same with SPARK-36057 [1]
>> > > > ""
>> > > >
>> > > > >>>
>> > > >
>> > > > Let's make this clearer. The FLIP consists of two parts:
>> > > > 1. Introducing a customized scheduler plugin mechanism for Flink on K8S.
>> > > >    This part is a general consideration.
>> > > > 2. Introducing ONE reference implementation for the customized scheduler.
>> > > >    Volcano is just one of them; if other schedulers or people are
>> > > >    interested, the integration of other schedulers can also be easily
>> > > >    completed.
>> > > >
>> > > > ""
>> > > > * We still have quite a lot of tech debt on testing for Kubernetes [2]. 
I think that this FLIP would be a great improvement for Flink, but I am
>> > > > worried that we will add more tech debt to those tests. Can we somehow
>> > > > improve this situation?
>> > > > ""
>> > > >
>> > > > >>>
>> > > >
>> > > > Yes, we will pay attention to the test problems related to
>> > > > Flink K8S, and we are happy to improve them. ;-)
>> > > >
>> > > > BR,
>> > > >
>> > > > Bo Zhao
>> > > >
>> > > > Martijn Visser <martijnvis...@apache.org> wrote on Wed, Jul 13, 2022 at 15:19:
>> > > >
>> > > > > Hi all,
>> > > > >
>> > > > > Thanks for the FLIP. I have a couple of remarks/questions:
>> > > > >
>> > > > > * Regarding the motivation, it mentions that the development trend is that
>> > > > > Flink supports both batch and stream processing. I think the vision and
>> > > > > trend is that we have unified batch- and stream processing. What I'm
>> > > > > missing is the vision on what's the impact for customized Kubernetes
>> > > > > schedulers on stream processing. Could there be some elaboration on that?
>> > > > > * While the FLIP talks about customized schedulers, it focuses on Volcano.
>> > > > > Why is the choice made to only focus on Volcano and not on other schedulers
>> > > > > like Apache YuniKorn? Can we not also provide an implementation for
>> > > > > YuniKorn at the same time? Spark did the same with SPARK-36057 [1]
>> > > > > * We still have quite a lot of tech debt on testing for Kubernetes [2]. I
>> > > > > think that this FLIP would be a great improvement for Flink, but I am
>> > > > > worried that we will add more tech debt to those tests. Can we somehow
>> > > > > improve this situation?
>> > > > >
>> > > > > Best regards,
>> > > > >
>> > > > > Martijn
>> > > > >
>> > > > > [1] https://issues.apache.org/jira/browse/SPARK-36057
>> > > > > [2] https://issues.apache.org/jira/browse/FLINK-20392
>> > > > >
>> > > > > On Wed, Jul 13, 2022 at 04:11, 王正 <cswangzh...@gmail.com> wrote:
>> > > > >
>> > > > > > +1
>> > > > > >
>> > > > > > On 2022/07/07 01:15:13 bo zhaobo wrote:
>> > > > > > > Hi, all.
>> > > > > > >
>> > > > > > > I would like to raise a discussion on the Flink dev ML about
>> > > > > > > "Support Customized Kubernetes Schedulers".
>> > > > > > > Currently, Kubernetes is becoming more and more popular for Flink
>> > > > > > > cluster deployment, and its capabilities keep improving; in
>> > > > > > > particular, it supports customized scheduling.
>> > > > > > > In high-performance workloads, we need to apply new scheduling
>> > > > > > > policies to meet new requirements. Today, the Flink native
>> > > > > > > Kubernetes solution uses the default Kubernetes scheduler for all
>> > > > > > > scenarios, and the default scheduling policy might be difficult to
>> > > > > > > apply in some extreme cases. So we need to improve Flink's
>> > > > > > > Kubernetes support to integrate customized Kubernetes schedulers
>> > > > > > > with Flink native Kubernetes, giving Flink administrators and users
>> > > > > > > a more flexible way to run their Flink clusters on Kubernetes.
>> > > > > > >
>> > > > > > > The proposal introduces a plugin mechanism for customized K8S
>> > > > > > > schedulers and a reference implementation, 'Volcano', in Flink.
>> > > > > > > See [1] for more details.
>> > > > > > >
>> > > > > > > Looking forward to your feedback.
>> > > > > > >
>> > > > > > > [1]
>> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-250%3A+Support+Customized+Kubernetes+Schedulers+Proposal
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > BR
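
A note on the resource-deadlock scenario discussed in this thread (the cluster has room for one job, but two are submitted): the difference between per-pod placement and all-or-nothing (gang) admission can be sketched with a toy model. This is only an illustration under stated assumptions, not Volcano's or the default Kubernetes scheduler's actual algorithm. `Job`, `schedule_per_pod`, `schedule_gang`, `runnable`, and the one-resource-unit-per-pod capacity model are all hypothetical, and the round-robin interleaving merely stands in for two jobs requesting pods at the same time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    pods_needed: int  # toy model: every pod consumes exactly one resource unit


def schedule_per_pod(jobs, capacity):
    """Place pods one at a time, interleaved across jobs.

    Assumption: this models a scheduler with no job-level grouping,
    where two jobs race for the same free capacity.
    """
    placed = {job.name: 0 for job in jobs}
    free = capacity
    progress = True
    while free > 0 and progress:
        progress = False
        for job in jobs:
            if free > 0 and placed[job.name] < job.pods_needed:
                placed[job.name] += 1
                free -= 1
                progress = True
    return placed


def schedule_gang(jobs, capacity):
    """Admit a job only if ALL of its pods fit into the remaining capacity."""
    placed = {job.name: 0 for job in jobs}
    free = capacity
    for job in jobs:
        if job.pods_needed <= free:
            placed[job.name] = job.pods_needed
            free -= job.pods_needed
    return placed


def runnable(jobs, placed):
    """A job can run only when every pod it needs has been placed."""
    return [job.name for job in jobs if placed[job.name] == job.pods_needed]


# The cluster has room for exactly one 4-pod job, but two are submitted.
jobs = [Job("job-a", 4), Job("job-b", 4)]
per_pod = schedule_per_pod(jobs, capacity=4)
gang = schedule_gang(jobs, capacity=4)

print(runnable(jobs, per_pod))  # [] : each job holds 2 of 4 pods, neither can start
print(runnable(jobs, gang))     # ['job-a'] : job-b waits without holding resources
```

In this toy model, per-pod placement leaves both jobs half-allocated and stuck, while gang admission lets one job run and keeps the other pending until capacity frees up.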