I think the session cluster cannot be deleted unless all of its running jobs have finished or been cancelled. I agree this should be clarified in the FLIP.

Best,
Yang

Thomas Weise <t...@apache.org> wrote on Tue, Mar 22, 2022 at 09:26:

> Hi Aitozi,
>
> Thanks for the proposal. Can you please clarify in the FLIP the relationship between the session deployment and the jobs that depend on it? Will, for example, the operator ensure that the individual jobs are deleted when the underlying cluster is deleted?
>
> Side note: The discussion thread started 5 days ago, the FLIP vote was started 2 days later, and a weekend fell in between, so this is probably on the short side for broader feedback.
>
> Thanks,
> Thomas
>
> On Fri, Mar 18, 2022 at 4:01 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
> > Great work. Since we are introducing a new public API, it deserves a FLIP. The FLIP will also help later contributors catch up quickly.
> >
> > Best,
> > Yang
> >
> > Gyula Fóra <gyula.f...@gmail.com> wrote on Fri, Mar 18, 2022 at 18:11:
> >
> > > Thanks Aitozi, a FLIP might be overkill at this point, but no harm in voting on it anyway :)
> > >
> > > Looks good!
> > >
> > > Gyula
> > >
> > > On Fri, Mar 18, 2022 at 10:25 AM Aitozi <gjying1...@gmail.com> wrote:
> > >
> > > > Hi Guys:
> > > >
> > > > FYI, I have integrated your comments and drawn up FLIP-215 [1]. I will create another thread to vote on it.
> > > >
> > > > [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-215%3A+Introduce+FlinkSessionJob+CRD+in+the+kubernetes+operator
> > > >
> > > > Best,
> > > >
> > > > Aitozi.
> > > >
> > > > Aitozi <gjying1...@gmail.com> wrote on Thu, Mar 17, 2022 at 11:16:
> > > >
> > > > > Hi Biao Geng:
> > > > >
> > > > > Thanks for your feedback, I'm +1 to go with option #2. It's a good point that we should improve error-message debugging for the session job. I think it can be follow-up work, as an improvement after we support the session job operation.
> > > > >
> > > > > Best,
> > > > >
> > > > > Aitozi.
> > > > >
> > > > > Geng Biao <biaoge...@gmail.com> wrote on Thu, Mar 17, 2022 at 10:55:
> > > > >
> > > > > > Thanks Aitozi for the work!
> > > > > >
> > > > > > I lean toward option #2 of using JarRunHeaders with an uber job jar as well. As Yang said, user-defined dependencies may be better supported in upstream Flink.
> > > > > >
> > > > > > A follow-up thought: I think we should consider the potential impact on user experience. Since the job graph is generated in the JM, when the generation fails due to some issue in the main() method, we should do some work on showing such error messages, either in this proposal or in the later k8s operator implementation. The reason for raising this is that if users submit many jobs to the same session cluster, it may not be easy for them to find the relevant error logs about the main() method of a specific job. FLINK-25715 could help us later.
> > > > > >
> > > > > > Best,
> > > > > > Biao Geng
> > > > > >
> > > > > > From: Aitozi <gjying1...@gmail.com>
> > > > > > Date: Wednesday, March 16, 2022, 5:19 PM
> > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > Subject: Re: [DISCUSS] Support the session job management in kubernetes operator
> > > > > >
> > > > > > Hi Yang Wang
> > > > > >
> > > > > > Thanks for your feedback. Providing the local and HTTP implementations for the first version makes sense to me. +1 for it.
> > > > > >
> > > > > > Best,
> > > > > > Aitozi
> > > > > >
> > > > > > Yang Wang <danrtsey...@gmail.com> wrote on Wed, Mar 16, 2022 at 16:44:
> > > > > >
> > > > > > > # How to download the user jars
> > > > > > > I agree with Gyula that it will be a burden if we bundle the Flink filesystem dependencies in the operator image.
> > > > > > > Maybe we could have an *ArtifactFetcher* interface in the flink-kubernetes-operator. By default, we provide the local and HTTP implementations, which means we could get the user jars from local files or HTTP URLs. Flink filesystem support could be done as a follow-up based on the feedback.
> > > > > > >
> > > > > > > If the user wants to use the local implementation, they need to mount a PV (persistent volume) to the operator first and then put their jars into the PV.
> > > > > > >
> > > > > > > # How to talk to session JobManager to submit the job
> > > > > > > After more consideration, I also prefer the second approach, via the REST API /jars/:jarid/run. If we have strong requirements to support dependency jars and artifacts, we could try to support this in the upstream project.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > > Aitozi <gjying1...@gmail.com> wrote on Wed, Mar 16, 2022 at 16:11:
> > > > > > >
> > > > > > > > Hi Gyula
> > > > > > > >
> > > > > > > > Thanks for your quick response. Regarding the dependency on different filesystems, I think we can make it optional and pluggable, and let users choose when building their operator image. Users can build their image from the base operator image and add the filesystem dependencies they want to use. BTW, we can support HTTP URIs by default.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Aitozi.
> > > > > > > >
> > > > > > > > Gyula Fóra <gyula.f...@gmail.com> wrote on Wed, Mar 16, 2022 at 15:53:
> > > > > > > >
> > > > > > > > > Thank you Aitozi!
> > > > > > > > >
> > > > > > > > > I think this will be a very nice (and simple) addition to enable these use-cases.
> > > > > > > > >
> > > > > > > > > I have 2 comments regarding the proposal:
> > > > > > > > >
> > > > > > > > > 1. I think if we want to support different filesystems to download jars from, we probably need some clever ways to add external operator dependencies (jars, configs). I would prefer not to bundle them into the base operator image.
> > > > > > > > >
> > > > > > > > > 2. I think we should avoid creating the jobgraphs on the operator side and use the jar upload/run REST API instead, as you suggested. This will avoid Flink version and dependency conflicts.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Gyula
> > > > > > > > >
> > > > > > > > > On Wed, Mar 16, 2022 at 8:41 AM Aitozi <gjying1...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Guys:
> > > > > > > > > >
> > > > > > > > > > I would like to open a discussion on supporting session job management in the kubernetes operator. It is intended to enhance the flink-kubernetes-operator to manage session jobs with k8s tooling. I have drafted the design doc [1]. Please refer to it and give me some feedback.
> > > > > > > > > >
> > > > > > > > > > [1] https://docs.google.com/document/d/1WPGbur1eT3H_5gN-kyXfp7EDjdbJUURx6jN8nt6UT-s/edit#
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > >
> > > > > > > > > > Aitozi.
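
For reference, the *ArtifactFetcher* idea discussed above (a local and an HTTP implementation, chosen by the jar URI) could be sketched roughly as follows. This is only an illustration: the thread names the interface but not its methods, so the `fetch` signature, the class names `LocalArtifactFetcher`, `HttpArtifactFetcher` and `ArtifactFetchers`, and the use of the `file` scheme for jars on a mounted PV are all assumptions, not the operator's actual API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Hypothetical interface: the thread only gives the name, the method is assumed. */
interface ArtifactFetcher {
    /** Makes the jar at jarUri available locally and returns its local path. */
    Path fetch(URI jarUri, Path targetDir) throws IOException;
}

/** Assumed local variant: the jar already sits on the operator's disk (e.g. a mounted PV). */
class LocalArtifactFetcher implements ArtifactFetcher {
    @Override
    public Path fetch(URI jarUri, Path targetDir) throws IOException {
        Path source = Paths.get(jarUri); // expects a file: URI
        if (!Files.isRegularFile(source)) {
            throw new IOException("Jar not found: " + source);
        }
        return source; // nothing to copy, the file is already local
    }
}

/** Assumed HTTP variant: downloads the jar into targetDir. */
class HttpArtifactFetcher implements ArtifactFetcher {
    @Override
    public Path fetch(URI jarUri, Path targetDir) throws IOException {
        Path fileName = Paths.get(jarUri.getPath()).getFileName();
        if (fileName == null || fileName.toString().isEmpty()) {
            throw new IOException("No file name in URI: " + jarUri);
        }
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName.toString());
        try (InputStream in = jarUri.toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }
}

/** Picks an implementation from the URI scheme (assumed selection rule). */
final class ArtifactFetchers {
    private ArtifactFetchers() {}

    static ArtifactFetcher forUri(URI jarUri) {
        String scheme = jarUri.getScheme();
        if ("file".equals(scheme)) {
            return new LocalArtifactFetcher();
        }
        if ("http".equals(scheme) || "https".equals(scheme)) {
            return new HttpArtifactFetcher();
        }
        // Other Flink filesystems are left as a follow-up, as suggested in the thread.
        throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }
}
```

With something like this, the operator would call `ArtifactFetchers.forUri(uri).fetch(uri, workDir)` and then hand the resulting local jar to the session cluster through the jar upload/run REST API. Keeping the scheme dispatch in one place is what would let further filesystem-backed fetchers be plugged in later without bundling them into the base image.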