Re: [DISCUSS] FLIP-250: Support Customized Kubernetes Schedulers Proposal

bo zhaobo Sun, 24 Jul 2022 18:05:12 -0700

Hi Martijn,

Thank you very much, that is all clear. I will start a vote for this then. ;-)


BR

Bo Zhao

On Sun, Jul 24, 2022 at 01:42, Martijn Visser <martijnvis...@apache.org> wrote:

> Hi all,
>
> Thanks a lot for clarifying, Yikun! I have no more concerns.
>
> Best regards,
>
> Martijn
>
> On Fri, Jul 22, 2022 at 10:42, bo zhaobo <bzhaojyathousa...@gmail.com> wrote:
>
> > Hi All,
> >
> > Thanks for all the feedback. All of it is helpful and valuable to us.
> >
> > If there are no further comments on FLIP-250, we plan to start a VOTE
> > thread next Monday.
> >
> > Thank you all!
> >
> > BR
> >
> > Bo Zhao
> >
> >
> > On Fri, Jul 15, 2022 at 10:02, bo zhaobo <bzhaojyathousa...@gmail.com> wrote:
> >
> > > Thanks all, @Yang Wang and @Yikun Jiang.
> > >
> > > Hi Martijn,
> > >
> > > We understand your concern. Do the emails above clear up your doubts?
> > >
> > > "
> > > Thanks for the info! I think I see that you've already updated the FLIP
> > > to reflect how customized schedulers are beneficial for both batch and
> > > streaming jobs.
> > > "
> > >
> > > >>>
> > >
> > > Yes, it is true that the "Motivation" paragraph confused readers, so
> > > I updated the FLIP description. Thanks for your feedback and
> > > correction.
> > >
> > > "
> > > The reason why I'm not too happy that we would only create a reference
> > > implementation for Volcano is that we don't know if the generic support
> > > for customized scheduler plugins will also work for others. We think it
> > > will, but since there would be no other implementation available, we are
> > > not sure. My concern is that when someone tries to add support for
> > > another scheduler, we notice that we actually made a mistake or should
> > > improve the generic support.
> > > "
> > >
> > > >>>
> > >
> > > Yes, I understand your concern. Do Yikun Jiang's description and
> > > shared experience address it? Or do we need to work out what some
> > > popular customized K8s schedulers have in common and update the doc?
> > > Waiting for your advice. ;-)
> > >
> > > Best regards,
> > >
> > > Bo Zhao
> > >
> > > On Thu, Jul 14, 2022 at 18:45, Yikun Jiang <yikunk...@gmail.com> wrote:
> > >
> > >> > And maybe we also could ping Yikun Jiang who has done similar things
> > >> > in Spark.
> > >>
> > >> Thanks for the ping, @wangyang. Yes, I was involved in Spark's customized
> > >> scheduler support work as the main contributor.
> > >>
> > >> For customized scheduler support, I can share the schedulers'
> > >> requirements here:
> > >>
> > >> 1. Help the scheduler to *specify* the scheduler name
> > >>
> > >> 2. Help the scheduler to create the *scheduler-related
> > >> label/annotation/CRD*, such as:
> > >> - Yunikorn needs labels/annotations
> > >> <https://yunikorn.apache.org/docs/user_guide/labels_and_annotations_in_yunikorn/>
> > >> (maybe a task group CRD in the future, or not)
> > >> - Volcano needs annotations and CRD <https://volcano.sh/en/docs/podgroup/>
> > >> - Kube-batch needs annotations/CRD
> > >> <https://github.com/kubernetes-sigs/kube-batch/tree/master/config/crds>
> > >> - Kueue needs annotation support
> > >> <https://github.com/kubernetes-sigs/kueue/blob/888cedb6e62c315e008916086308a893cd21dd66/config/samples/sample-job.yaml#L6>
> > >> and a cluster-level CRD
> > >>
> > >> 3. Help the scheduler to create the scheduler meta/CRD at the *right
> > >> time*, such as if users want to avoid pod max pending, we need to create
> > >> the scheduler-required CRD before pod creation.
> > >>
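A minimal Python sketch of the three requirements listed above, using Volcano as the example scheduler. It only builds plain dictionaries shaped like Kubernetes manifests and never talks to a cluster. The helper names (build_pod, build_pod_group, plan_resources), the image name and the job name are hypothetical. The "volcano" scheduler name, the "scheduling.k8s.io/group-name" annotation and the "scheduling.volcano.sh/v1beta1" PodGroup follow Volcano's conventions as I understand them and should be checked against the Volcano docs. This is an illustration only, not how the FLIP or Spark implements it.

```python
from typing import Any, Dict, List


def build_pod(name: str, scheduler_name: str,
              annotations: Dict[str, str]) -> Dict[str, Any]:
    """Requirements 1 and 2: set the scheduler name and its annotations."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "annotations": dict(annotations)},
        "spec": {
            # Requirement 1: tell Kubernetes which scheduler handles this pod.
            "schedulerName": scheduler_name,
            # Hypothetical container, only here to make the manifest complete.
            "containers": [{"name": "main", "image": "flink:latest"}],
        },
    }


def build_pod_group(name: str, min_member: int) -> Dict[str, Any]:
    """Requirement 2: the scheduler-specific CRD (assumed Volcano PodGroup)."""
    return {
        "apiVersion": "scheduling.volcano.sh/v1beta1",
        "kind": "PodGroup",
        "metadata": {"name": name},
        "spec": {"minMember": min_member},
    }


def plan_resources(job: str, task_managers: int) -> List[Dict[str, Any]]:
    """Requirement 3: return resources in creation order, CRD first."""
    # One JobManager pod plus the TaskManager pods form the group.
    group = build_pod_group(f"{job}-pg", min_member=task_managers + 1)
    # Assumed annotation key that links a pod to its PodGroup.
    annotations = {"scheduling.k8s.io/group-name": group["metadata"]["name"]}
    pods = [build_pod(f"{job}-jobmanager", "volcano", annotations)]
    pods += [
        build_pod(f"{job}-taskmanager-{i}", "volcano", annotations)
        for i in range(task_managers)
    ]
    # The PodGroup comes first so it exists before any pod is created.
    return [group] + pods


if __name__ == "__main__":
    for resource in plan_resources("demo", task_managers=2):
        print(resource["kind"], resource["metadata"]["name"])
```

Returning an ordered list keeps the "right time" requirement explicit: whoever applies the resources only has to create them in list order.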
> > >> For complex requirements, Spark uses feature steps (Flink decorators
> > >> look very similar). For simple requirements, users can just use
> > >> configuration or a Pod Template. [1]
> > >>
> > >> [1]
> > >> https://spark.apache.org/docs/latest/running-on-kubernetes.html#customized-kubernetes-schedulers-for-spark-on-kubernetes
> > >>
> > >> From the FLIP, I can see the above requirements are covered.
> > >>
> > >> BTW, I think the existing and newly added interfaces of the Flink
> > >> decorators already cover all requirements of Kubernetes, so I personally
> > >> think the K8s-related scheduler requirements can also be well covered by
> > >> them.
> > >>
> > >> Regards,
> > >> Yikun
> > >>
> > >>
> > >> On Thu, Jul 14, 2022 at 5:11 PM Yang Wang <danrtsey...@gmail.com> wrote:
> > >>
> > >> > I think we could go over the customized scheduler plugin mechanism
> > >> > again with YuniKorn to make sure that it is common enough.
> > >> > But the implementation could be deferred.
> > >> >
> > >> > And maybe we also could ping Yikun Jiang who has done similar things
> > >> > in Spark.
> > >> >
> > >> > For the e2e tests, I admit that they could be improved. But I am not
> > >> > sure whether we really need the Java implementation instead.
> > >> > This is out of the scope of this FLIP, so let's keep the discussion
> > >> > under FLINK-20392.
> > >> >
> > >> >
> > >> > Best,
> > >> > Yang
> > >> >
> > >> > On Thu, Jul 14, 2022 at 15:28, Martijn Visser <martijnvis...@apache.org> wrote:
> > >> >
> > >> > > Hi Bo,
> > >> > >
> > >> > > Thanks for the info! I think I see that you've already updated the
> > >> > > FLIP to reflect how customized schedulers are beneficial for both
> > >> > > batch and streaming jobs.
> > >> > >
> > >> > > The reason why I'm not too happy that we would only create a
> > >> > > reference implementation for Volcano is that we don't know if the
> > >> > > generic support for customized scheduler plugins will also work for
> > >> > > others. We think it will, but since there would be no other
> > >> > > implementation available, we are not sure. My concern is that when
> > >> > > someone tries to add support for another scheduler, we notice that
> > >> > > we actually made a mistake or should improve the generic support.
> > >> > >
> > >> > > Best regards,
> > >> > >
> > >> > > Martijn
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Thu, Jul 14, 2022 at 05:30, bo zhaobo <bzhaojyathousa...@gmail.com> wrote:
> > >> > >
> > >> > > > Hi Martijn,
> > >> > > >
> > >> > > > Thank you for your comments. I will answer the questions one by one.
> > >> > > >
> > >> > > > ""
> > >> > > > * Regarding the motivation, it mentions that the development trend
> > >> > > > is that Flink supports both batch and stream processing. I think
> > >> > > > the vision and trend is that we have unified batch- and stream
> > >> > > > processing. What I'm missing is the vision on what's the impact for
> > >> > > > customized Kubernetes schedulers on stream processing. Could there
> > >> > > > be some elaboration on that?
> > >> > > > ""
> > >> > > >
> > >> > > > >>
> > >> > > >
> > >> > > > We very much agree with you on the trend of unified batch and
> > >> > > > stream processing in Flink. Using a customized K8s scheduler is
> > >> > > > beneficial for streaming scenarios too, for example to avoid
> > >> > > > resource deadlock. Suppose the remaining resources in the K8s
> > >> > > > cluster are only enough to run one job, but we submit two. With the
> > >> > > > default K8s scheduler, both jobs request resources at the same
> > >> > > > time, each gets only part of what it needs, and both hang. The
> > >> > > > customized scheduler Volcano, in contrast, won't schedule a job's
> > >> > > > pods if the idle resources cannot fit all of the pods the job
> > >> > > > needs. So the benefits mentioned in the FLIP are not only for batch
> > >> > > > jobs. In fact, all 4 scheduling capabilities mentioned in the FLIP
> > >> > > > are also required for stream processing. YARN has some of those
> > >> > > > scheduling features too, such as priority scheduling and min/max
> > >> > > > resource constraints.
> > >> > > >
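The deadlock scenario above can be sketched as a toy simulation. It is a simplified model under stated assumptions, not the real behaviour of either scheduler: every pod needs exactly one slot, the "pod by pod" policy interleaves the two jobs round-robin, and the "gang" policy admits a job only if all of its pods fit. The function names, job names and numbers are made up for illustration.

```python
from typing import Dict, List


def schedule_pod_by_pod(capacity: int, jobs: Dict[str, int]) -> Dict[str, int]:
    """Place pods one at a time across jobs, with no notion of a whole job."""
    placed = {name: 0 for name in jobs}
    free = capacity
    progress = True
    while free > 0 and progress:
        progress = False
        for name, needed in jobs.items():
            if free > 0 and placed[name] < needed:
                placed[name] += 1
                free -= 1
                progress = True
    return placed


def schedule_gang(capacity: int, jobs: Dict[str, int]) -> Dict[str, int]:
    """Admit a job only if all of its pods fit into the free slots."""
    placed = {name: 0 for name in jobs}
    free = capacity
    for name, needed in jobs.items():
        if needed <= free:
            placed[name] = needed
            free -= needed
    return placed


def runnable(placed: Dict[str, int], jobs: Dict[str, int]) -> List[str]:
    """A job can run only when every pod it needs has been placed."""
    return [name for name, needed in jobs.items() if placed[name] == needed]


if __name__ == "__main__":
    jobs = {"job-a": 3, "job-b": 3}  # each job needs 3 pods
    capacity = 4                     # enough for one job, not for both

    naive = schedule_pod_by_pod(capacity, jobs)
    print("pod by pod:", naive, "runnable:", runnable(naive, jobs))

    gang = schedule_gang(capacity, jobs)
    print("gang:      ", gang, "runnable:", runnable(gang, jobs))
```

In this model the pod-by-pod policy leaves both jobs with two of their three pods, so neither can start, while the all-or-nothing policy lets the first job run and keeps the second one waiting without holding any slots.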
> > >> > > > ""
> > >> > > > * While the FLIP talks about customized schedulers, it focuses on
> > >> > > > Volcano. Why is the choice made to only focus on Volcano and not on
> > >> > > > other schedulers like Apache YuniKorn? Can we not also provide an
> > >> > > > implementation for YuniKorn at the same time? Spark did the same
> > >> > > > with SPARK-36057 [1]
> > >> > > > ""
> > >> > > >
> > >> > > > >>
> > >> > > >
> > >> > > > Let me clarify. The FLIP consists of two parts:
> > >> > > > 1. Introducing a customized scheduler plugin mechanism in Flink
> > >> > > > K8s. This part is generic.
> > >> > > > 2. Introducing ONE reference implementation for the customized
> > >> > > > scheduler. Volcano is just one option; if others are interested in
> > >> > > > other schedulers, those can be integrated easily as well.
> > >> > > >
> > >> > > > ""
> > >> > > > * We still have quite a lot of tech debt on testing for Kubernetes
> > >> > > > [2]. I think that this FLIP would be a great improvement for Flink,
> > >> > > > but I am worried that we will add more tech debt to those tests.
> > >> > > > Can we somehow improve this situation?
> > >> > > > ""
> > >> > > >
> > >> > > > >>
> > >> > > >
> > >> > > > Yes, we will pay attention to the test problems related to Flink
> > >> > > > K8s, and we are happy to improve them. ;-)
> > >> > > >
> > >> > > > BR,
> > >> > > >
> > >> > > > Bo Zhao
> > >> > > >
> > >> > > > On Wed, Jul 13, 2022 at 15:19, Martijn Visser <martijnvis...@apache.org> wrote:
> > >> > > >
> > >> > > > > Hi all,
> > >> > > > >
> > >> > > > > Thanks for the FLIP. I have a couple of remarks/questions:
> > >> > > > >
> > >> > > > > * Regarding the motivation, it mentions that the development
> > >> > > > > trend is that Flink supports both batch and stream processing. I
> > >> > > > > think the vision and trend is that we have unified batch- and
> > >> > > > > stream processing. What I'm missing is the vision on what's the
> > >> > > > > impact for customized Kubernetes schedulers on stream processing.
> > >> > > > > Could there be some elaboration on that?
> > >> > > > > * While the FLIP talks about customized schedulers, it focuses on
> > >> > > > > Volcano. Why is the choice made to only focus on Volcano and not
> > >> > > > > on other schedulers like Apache YuniKorn? Can we not also provide
> > >> > > > > an implementation for YuniKorn at the same time? Spark did the
> > >> > > > > same with SPARK-36057 [1]
> > >> > > > > * We still have quite a lot of tech debt on testing for
> > >> > > > > Kubernetes [2]. I think that this FLIP would be a great
> > >> > > > > improvement for Flink, but I am worried that we will add more
> > >> > > > > tech debt to those tests. Can we somehow improve this situation?
> > >> > > > >
> > >> > > > > Best regards,
> > >> > > > >
> > >> > > > > Martijn
> > >> > > > >
> > >> > > > > [1] https://issues.apache.org/jira/browse/SPARK-36057
> > >> > > > > [2] https://issues.apache.org/jira/browse/FLINK-20392
> > >> > > > >
> > >> > > > > On Wed, Jul 13, 2022 at 04:11, 王正 <cswangzh...@gmail.com> wrote:
> > >> > > > >
> > >> > > > > > +1
> > >> > > > > >
> > >> > > > > > On 2022/07/07 01:15:13 bo zhaobo wrote:
> > >> > > > > > > Hi, all.
> > >> > > > > > >
> > >> > > > > > > I would like to start a discussion on the Flink dev ML about
> > >> > > > > > > supporting customized Kubernetes schedulers.
> > >> > > > > > > Kubernetes is becoming more and more popular for Flink
> > >> > > > > > > cluster deployment, and its capabilities keep improving; in
> > >> > > > > > > particular, it supports customized scheduling.
> > >> > > > > > > High-performance workloads need new scheduling policies to
> > >> > > > > > > meet new requirements. Today the Flink native Kubernetes
> > >> > > > > > > integration uses the default Kubernetes scheduler for all
> > >> > > > > > > scenarios, and the default scheduling policy can be difficult
> > >> > > > > > > to apply in some extreme cases. We therefore need to improve
> > >> > > > > > > Flink on Kubernetes so that customized Kubernetes schedulers
> > >> > > > > > > can be coupled with Flink native Kubernetes, giving Flink
> > >> > > > > > > administrators and users a more flexible way to run their
> > >> > > > > > > Flink clusters on Kubernetes.
> > >> > > > > > >
> > >> > > > > > > The proposal introduces a plugin mechanism for customized K8s
> > >> > > > > > > schedulers and a reference implementation, 'Volcano', in
> > >> > > > > > > Flink. For more details, see [1].
> > >> > > > > > >
> > >> > > > > > > Looking forward to your feedback.
> > >> > > > > > >
> > >> > > > > > > [1]
> > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-250%3A+Support+Customized+Kubernetes+Schedulers+Proposal
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > > BR
> > >> > > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>
