Re: [DISCUSS] Support the session job management in kubernetes operator

Yang Wang Mon, 21 Mar 2022 19:11:41 -0700

I think the session cluster cannot be deleted until all of its running
jobs have finished or been cancelled. I agree this should be clarified in the
FLIP.


Best,
Yang

Thomas Weise <[email protected]> wrote on Tue, Mar 22, 2022 at 09:26:

> Hi Aitozi,
>
> Thanks for the proposal. Can you please clarify in the FLIP the
> relationship between the session deployment and the jobs that depend on it?
> Will, for example, the operator ensure that the individual jobs are
> deleted when the underlying cluster is deleted?
>
> Side note: the discussion thread started 5 days ago, the FLIP vote was
> started 2 days later, and that period included a weekend, so this is
> probably on the short side for broader feedback.
>
> Thanks,
> Thomas
>
>
> On Fri, Mar 18, 2022 at 4:01 AM Yang Wang <[email protected]> wrote:
>
> > Great work. Since we are introducing a new public API, it deserves a
> > FLIP. The FLIP will also help later contributors catch up quickly.
> >
> > Best,
> > Yang
> >
> > Gyula Fóra <[email protected]> wrote on Fri, Mar 18, 2022 at 18:11:
> >
> > > Thanks Aitozi, a FLIP might be overkill at this point, but there is no
> > > harm in voting on it anyway :)
> > >
> > > Looks good!
> > >
> > > Gyula
> > >
> > > On Fri, Mar 18, 2022 at 10:25 AM Aitozi <[email protected]> wrote:
> > >
> > > > Hi Guys:
> > > >
> > > >     FYI, I have integrated your comments and drafted FLIP-215 [1]. I
> > > > will create another thread to vote on it.
> > > >
> > > > [1]:
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-215%3A+Introduce+FlinkSessionJob+CRD+in+the+kubernetes+operator
> > > >
> > > > Best,
> > > >
> > > > Aitozi.
> > > >
> > > >
> > > > Aitozi <[email protected]> wrote on Thu, Mar 17, 2022 at 11:16:
> > > >
> > > > > Hi Biao Geng:
> > > > >
> > > > >    Thanks for your feedback, I'm +1 to go with option#2. It's a good
> > > > > point that we should make errors in session jobs easier to debug. I
> > > > > think it can be follow-up work, as an improvement after we support
> > > > > the session job operation.
> > > > >
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Aitozi.
> > > > >
> > > > >
> > > > > Geng Biao <[email protected]> wrote on Thu, Mar 17, 2022 at 10:55:
> > > > >
> > > > >> Thanks Aitozi for the work!
> > > > >>
> > > > > >> I also lean towards option#2 of using JarRunHeaders with an uber
> > > > > >> job jar. As Yang said, user-defined dependencies may be better
> > > > > >> supported in upstream Flink.
> > > > > >> A follow-up thought: I think we should consider the potential
> > > > > >> impact on user experience. Since the job graph is generated in the
> > > > > >> JM, when generation fails due to an issue in the main() method, we
> > > > > >> should do some work on surfacing such error messages, either in
> > > > > >> this proposal or in the later k8s operator implementation. The
> > > > > >> reason is that if users submit many jobs to the same session
> > > > > >> cluster, it may not be easy for them to find the error logs
> > > > > >> relevant to the main() method of a specific job. FLINK-25715 could
> > > > > >> help us later.
> > > > >>
> > > > >>
> > > > >> Best,
> > > > >> Biao Geng
> > > > >>
> > > > >>
> > > > >> 发件人: Aitozi <[email protected]>
> > > > >> 日期: 星期三, 2022年3月16日 下午5:19
> > > > >> 收件人: [email protected] <[email protected]>
> > > > >> 主题: Re: [DISCUSS] Support the session job management in kubernetes
> > > > >> operator
> > > > >> Hi Yang Wang
> > > > >>     Thanks for your feedback, Provide the local and http
> > > implementation
> > > > >> for
> > > > >> the first version makes sense to me.
> > > > >> +1 for it.
> > > > >>
> > > > >> Best,
> > > > >> Aitozi
> > > > >>
> > > > > >> Yang Wang <[email protected]> wrote on Wed, Mar 16, 2022 at 16:44:
> > > > >>
> > > > >> > # How to download the user jars
> > > > > >> > I agree with Gyula that it will be a burden if we bundle the Flink
> > > > > >> > filesystem dependencies in the operator image.
> > > > > >> > Maybe we could have an *ArtifactFetcher* interface in the
> > > > > >> > flink-kubernetes-operator. By default, we would provide the local
> > > > > >> > and HTTP implementations, which means we could get the user jars
> > > > > >> > from local files or HTTP URLs. Flink filesystem support could be
> > > > > >> > done as a follow-up based on the feedback.
> > > > > >> >
> > > > > >> > If users want to use the local implementation, they need to mount
> > > > > >> > a PV (PersistentVolume) to the operator first and then put their
> > > > > >> > jars into the PV.
> > > > >> >
> > > > >> > # How to talk to session JobManager to submit the job
> > > > > >> > After more consideration, I also prefer the second approach, via
> > > > > >> > the REST API /jars/:jarid/run. If we have strong requirements to
> > > > > >> > support dependency jars and artifacts, we could try to support
> > > > > >> > this in the upstream project.
> > > > >> >
> > > > >> > Best,
> > > > >> > Yang
> > > > >> >
> > > > >> >
> > > > > >> > Aitozi <[email protected]> wrote on Wed, Mar 16, 2022 at 16:11:
> > > > >> >
> > > > >> > > Hi Gyula
> > > > >> > >     Thanks for your quick response. Regarding the different
> > > > >> filesystems
> > > > >> > > dependency,
> > > > >> > > I think we can make it optional and pluggable, and let it
> choose
> > > by
> > > > >> user
> > > > >> > > when building
> > > > >> > > their operator image. Users can build their image from the
> base
> > > > >> operator
> > > > >> > > image and
> > > > >> > > add filesystem dependency they want to use to it. BTW, we can
> > > > support
> > > > >> the
> > > > >> > > http URI
> > > > >> > > by default.
> > > > >> > >
> > > > >> > > Thanks,
> > > > >> > > Aitozi.
> > > > >> > >
> > > > > >> > > Gyula Fóra <[email protected]> wrote on Wed, Mar 16, 2022 at 15:53:
> > > > >> > >
> > > > >> > > > Thank you Aitozi!
> > > > >> > > >
> > > > > >> > > > I think this will be a very nice (and simple) addition to
> > > > > >> > > > enable these use-cases.
> > > > > >> > > >
> > > > > >> > > > I have 2 comments regarding the proposal:
> > > > > >> > > >
> > > > > >> > > > 1. I think if we want to support different filesystems to
> > > > > >> > > > download jars from, we probably need some clever way to add
> > > > > >> > > > external operator dependencies (jars, configs).
> > > > > >> > > > I would prefer not to bundle them into the base operator
> > > > > >> > > > image.
> > > > > >> > > >
> > > > > >> > > > 2. I think we should avoid creating the jobgraphs on the
> > > > > >> > > > operator side and use the jar upload/run REST API instead, as
> > > > > >> > > > you suggested. This will avoid Flink version and dependency
> > > > > >> > > > conflicts.
> > > > >> > > >
> > > > >> > > > Cheers,
> > > > >> > > > Gyula
> > > > >> > > >
> > > > > >> > > > On Wed, Mar 16, 2022 at 8:41 AM Aitozi <[email protected]> wrote:
> > > > >> > > >
> > > > >> > > > > Hi Guys:
> > > > >> > > > >
> > > > >> > > > >     I would like to open a discussion for support session
> > job
> > > > >> > > management
> > > > >> > > > in
> > > > >> > > > > kubernetes operator. It’s intended to enhance the
> > > > >> > > > flink-kubernetes-operator
> > > > >> > > > > to manage the session job with k8s tooling. I have drafted
> > the
> > > > >> design
> > > > >> > > > > doc[1]. Please refer to it and give me some feedback .
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > [1]
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://docs.google.com/document/d/1WPGbur1eT3H_5gN-kyXfp7EDjdbJUURx6jN8nt6UT-s/edit#
> > > > >> <
> > > > >>
> > > >
> > >
> >
> https://docs.google.com/document/d/1WPGbur1eT3H_5gN-kyXfp7EDjdbJUURx6jN8nt6UT-s/edit
> > > > >> >
> > > > >> > > > >
> > > > >> > > > > Best,
> > > > >> > > > >
> > > > >> > > > > Aitozi.
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>
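
The *ArtifactFetcher* idea raised in the thread above (local and HTTP
implementations first, Flink filesystems as a later follow-up) could look
roughly like the Java sketch below. It is only an illustration under
assumptions: the thread names the interface but not its methods, so the
`fetch` signature, the class names `LocalArtifactFetcher`,
`HttpArtifactFetcher` and `ArtifactFetchers`, and the selection by URI scheme
are all hypothetical and are not taken from the flink-kubernetes-operator code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical interface: the thread proposes the name but defines no methods. */
interface ArtifactFetcher {
    /** Makes the jar at {@code uri} available locally and returns its path. */
    Path fetch(URI uri, Path targetDir) throws IOException;
}

/** Local variant: the jar already sits on a volume (e.g. a PV) mounted into the operator. */
class LocalArtifactFetcher implements ArtifactFetcher {
    @Override
    public Path fetch(URI uri, Path targetDir) throws IOException {
        // A URI without a scheme is treated as a plain local path.
        Path jar = uri.getScheme() == null ? Paths.get(uri.getPath()) : Paths.get(uri);
        if (!Files.isRegularFile(jar)) {
            throw new IOException("Jar not found: " + jar);
        }
        return jar; // nothing to copy, the file is already local
    }
}

/** HTTP variant: downloads the jar into {@code targetDir}. */
class HttpArtifactFetcher implements ArtifactFetcher {
    @Override
    public Path fetch(URI uri, Path targetDir) throws IOException {
        String path = uri.getPath() == null ? "" : uri.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        if (fileName.isEmpty()) {
            throw new IOException("Cannot derive a file name from " + uri);
        }
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        try (InputStream in = uri.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}

/** Hypothetical helper that picks an implementation from the URI scheme. */
final class ArtifactFetchers {
    private ArtifactFetchers() {}

    static ArtifactFetcher forUri(URI uri) {
        String scheme = uri.getScheme();
        if (scheme == null || scheme.equals("file")) {
            return new LocalArtifactFetcher();
        }
        if (scheme.equals("http") || scheme.equals("https")) {
            return new HttpArtifactFetcher();
        }
        // Other schemes (e.g. Flink filesystems) are the follow-up mentioned in the thread.
        throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }
}
```

With such a split, the operator would only need the JDK for the default
fetchers, and additional filesystem support could be plugged in later without
bundling those dependencies into the base operator image.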
