Also from my side +1 to start voting.

Cheers,
Kostas
On Thu, Mar 5, 2020 at 7:45 AM tison <wander4...@gmail.com> wrote:
>
> +1 to start voting.
>
> Best,
> tison.
>
> On Thu, Mar 5, 2020 at 2:29 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>
>> Hi Peter,
>> Thanks a lot for your response.
>>
>> Hi all @Kostas Kloudas @Zili Chen @Peter Huang @Rong Rong
>> It seems that we have reached an agreement. The “application mode” is regarded as an enhanced “per-job” mode. It is orthogonal to “cluster deploy”. Currently, we bind “per-job” to `run-user-main-on-client` and “application mode” to `run-user-main-on-cluster`.
>>
>> Do you have any other concerns about moving FLIP-85 to voting?
>>
>> Best,
>> Yang
>>
>> On Thu, Mar 5, 2020 at 12:48 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>
>>> Hi Yang and Kostas,
>>>
>>> Thanks for the clarification. It makes more sense to me if the long-term goal is to replace per-job mode with application mode in the future (once multiple `execute()` calls can be supported). Before that, it will be better to keep the concept of application mode internal. As Yang suggested, users only need to use a `-R/-- remote-deploy` CLI option to launch a per-job cluster with the main function executed in the cluster entry point.
>>> +1 for the execution plan.
>>>
>>> Best Regards
>>> Peter Huang
>>>
>>> On Tue, Mar 3, 2020 at 7:11 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> Having the application mode does not mean we will drop the cluster-deploy option. I just want to share some thoughts about “Application Mode”.
>>>>
>>>> 1. The application mode could cover the per-job semantics. Its lifecycle is bound to the user `main()`, and all the jobs in the user main will be executed in the same Flink cluster. In the first phase of the FLIP-85 implementation, running the user main on the cluster side could be supported in application mode.
>>>>
>>>> 2.
Maybe in the future, we will also need to support multiple `execute()` calls on the client side in the same Flink cluster. Then the per-job mode will evolve into application mode.
>>>>
>>>> 3. From the user's perspective, only a `-R/-- remote-deploy` CLI option is visible. They are not aware of the application mode.
>>>>
>>>> 4. In the first phase, the application mode works as “per-job” (only one job in the user main). We just leave more potential for the future.
>>>>
>>>> I am not against calling it “cluster deploy mode” if you all think it is clearer for users.
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> On Tue, Mar 3, 2020 at 6:49 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>>
>>>>> Hi Peter,
>>>>>
>>>>> I understand your point. This is why I was also a bit torn about the name, and my proposal was somewhat aligned with yours (something along the lines of "cluster deploy" mode).
>>>>>
>>>>> But many of the other participants in the discussion suggested the "Application Mode". I think the reasoning is that the user's application is now more self-contained: it will be submitted to the cluster and the user can just disconnect. In addition, as discussed briefly in the doc, in the future there may be better support for multi-execute applications, which will bring us one step closer to the true "Application Mode". But this is how I interpreted their arguments; of course they can also express their thoughts on the topic :)
>>>>>
>>>>> Cheers,
>>>>> Kostas
>>>>>
>>>>> On Mon, Mar 2, 2020 at 6:15 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >
>>>>> > Hi Kostas,
>>>>> >
>>>>> > Thanks for updating the wiki. We have aligned with the implementations in the doc. But I feel the naming is still a little confusing from a user's perspective. It is well known that Flink supports per-job clusters and session clusters.
The concept is at the layer of how a job is managed within Flink. The method introduced until now is a kind of mix of job and session cluster, as a compromise on implementation complexity. We probably don't need to label it as Application Mode at the same layer as per-job cluster and session cluster. Conceptually, I think it is still a cluster-mode implementation for the per-job cluster.
>>>>> >
>>>>> > To minimize confusion for users, I think it would be better to make it just an option of the per-job cluster for each type of cluster manager. What do you think?
>>>>> >
>>>>> > Best Regards
>>>>> > Peter Huang
>>>>> >
>>>>> > On Mon, Mar 2, 2020 at 7:22 AM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>> >>
>>>>> >> Hi Yang,
>>>>> >>
>>>>> >> The difference between per-job and application mode is that, as you described, in the per-job mode the main is executed on the client, while in the application mode the main is executed on the cluster.
>>>>> >> I do not think we have to offer "application mode" with the main running on the client side, as this is exactly what the per-job mode does currently and, as you also described, it would be redundant.
>>>>> >>
>>>>> >> Sorry if this was not clear in the document.
>>>>> >>
>>>>> >> Cheers,
>>>>> >> Kostas
>>>>> >>
>>>>> >> On Mon, Mar 2, 2020 at 3:17 PM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>> >> >
>>>>> >> > Hi Kostas,
>>>>> >> >
>>>>> >> > Thanks a lot for your conclusion and for updating the FLIP-85 wiki. Currently, I have no more questions about the motivation, approach, fault tolerance, and the first-phase implementation.
>>>>> >> >
>>>>> >> > I think the new title "Flink Application Mode" makes a lot of sense to me. Especially for containerized environments, the cluster deploy option will be very useful.
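
The distinction drawn in Kostas's reply above (per-job runs the user `main()` on the client, application mode runs it on the cluster) can be written down as a small lookup. This is only a sketch under assumptions: `DeployModeSketch`, `Mode`, and `MainLocation` are hypothetical names chosen for illustration and are not Flink classes or CLI options.

```java
/** Hypothetical model of the mode-to-main() mapping discussed in this thread; not a Flink API. */
public class DeployModeSketch {

    /** The two deployment concepts being compared in the thread. */
    public enum Mode { PER_JOB, APPLICATION }

    /** Where the user program's main() is executed. */
    public enum MainLocation { CLIENT, CLUSTER }

    /** Returns where main() runs for a given mode, following the binding described in the thread. */
    public static MainLocation mainLocation(Mode mode) {
        switch (mode) {
            case PER_JOB:
                return MainLocation.CLIENT;   // run-user-main-on-client
            case APPLICATION:
                return MainLocation.CLUSTER;  // run-user-main-on-cluster
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> main() runs on " + mainLocation(mode));
        }
    }
}
```

Under this reading there is no "application mode with main on the client", which is why that combination does not appear in the mapping.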
>>>>> >> >
>>>>> >> > Just one concern: how do we introduce this new application mode to our users?
>>>>> >> > Each user program (i.e. `main()`) is an application. Currently, we intend to support only one `execute()`. So what's the difference between per-job and application mode?
>>>>> >> >
>>>>> >> > For per-job, the user `main()` is always executed on the client side. For application mode, the user `main()` could be executed on the client or master side (configured via a CLI option). Right? We need to have a clear concept. Otherwise, users will become more and more confused.
>>>>> >> >
>>>>> >> > Best,
>>>>> >> > Yang
>>>>> >> >
>>>>> >> > On Mon, Mar 2, 2020 at 5:58 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>> >> >>
>>>>> >> >> Hi all,
>>>>> >> >>
>>>>> >> >> I updated
>>>>> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode
>>>>> >> >> based on the discussion we had here:
>>>>> >> >>
>>>>> >> >> https://docs.google.com/document/d/1ji72s3FD9DYUyGuKnJoO4ApzV-nSsZa0-bceGXW7Ocw/edit#
>>>>> >> >>
>>>>> >> >> Please let me know what you think and please keep the discussion in the ML :)
>>>>> >> >>
>>>>> >> >> Thanks for starting the discussion and I hope that soon we will be able to vote on the FLIP.
>>>>> >> >>
>>>>> >> >> Cheers,
>>>>> >> >> Kostas
>>>>> >> >>
>>>>> >> >> On Thu, Jan 16, 2020 at 3:40 AM Yang Wang <danrtsey...@gmail.com> wrote:
>>>>> >> >> >
>>>>> >> >> > Hi all,
>>>>> >> >> >
>>>>> >> >> > Thanks a lot for the feedback from @Kostas Kloudas. All your concerns are on point. FLIP-85 is mainly focused on supporting cluster mode for per-job, since it is more urgent and has many more use cases in both Yarn and Kubernetes deployments.
For session cluster, we could have more discussion in a new thread later.
>>>>> >> >> >
>>>>> >> >> > #1, How to download the user jars and dependencies for per-job in cluster mode?
>>>>> >> >> > For Yarn, we could register the user jars and dependencies as LocalResource. They will be distributed by Yarn, and once the JobManager and TaskManager are launched, the jars already exist.
>>>>> >> >> > For Standalone per-job and K8s, we expect that the user jars and dependencies are built into the image, or an InitContainer could be used for downloading. This is natively distributed, so we will not have a bottleneck.
>>>>> >> >> >
>>>>> >> >> > #2, Job graph recovery
>>>>> >> >> > We could have an optimization to store the job graph on the DFS. However, I suggest that building a new job graph from the configuration be the default option, since we will not always have a DFS store when deploying a Flink per-job cluster. Of course, we assume that using the same configuration (e.g. job_id, user_jar, main_class, main_args, parallelism, savepoint_settings, etc.) will produce the same job graph. I think the standalone per-job mode already has similar behavior.
>>>>> >> >> >
>>>>> >> >> > #3, What happens with jobs that have multiple execute calls?
>>>>> >> >> > Currently, it is really a problem. Even if we use a local client on the Flink master side, it will behave differently from client mode. In client mode, if we call execute multiple times, we will deploy a separate Flink cluster for each execute. I am not quite sure whether that is reasonable.
However, I still think using the local client is a good choice. We could continue the discussion in a new thread. @Zili Chen <wander4...@gmail.com> Do you want to drive this?
>>>>> >> >> >
>>>>> >> >> > Best,
>>>>> >> >> > Yang
>>>>> >> >> >
>>>>> >> >> > On Thu, Jan 16, 2020 at 1:55 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >
>>>>> >> >> > > Hi Kostas,
>>>>> >> >> > >
>>>>> >> >> > > Thanks for this feedback. I couldn't agree more. The cluster mode should be added first in the per-job cluster.
>>>>> >> >> > >
>>>>> >> >> > > 1) For job cluster implementation
>>>>> >> >> > > 1. Job graph recovery from configuration, or store it as a static job graph as in a session cluster. I think the static one will be better because of the shorter recovery time. Let me update the doc with details.
>>>>> >> >> > >
>>>>> >> >> > > 2. For jobs that execute multiple times, I think @Zili Chen <wander4...@gmail.com> has proposed the local client solution that can actually run the program in the cluster entry point. We can put the implementation in the second stage, or even in a new FLIP for further discussion.
>>>>> >> >> > >
>>>>> >> >> > > 2) For session cluster implementation
>>>>> >> >> > > We can disable the cluster mode for the session cluster in the first stage. I agree that jar downloading will be painful. We can consider a PoC and performance evaluation first. If the end-to-end experience is good enough, then we can consider proceeding with the solution.
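
The two recovery options weighed above (rebuilding the job graph from configuration on each recovery versus storing a static job graph at submission and fetching it later) can be sketched as follows. This is an illustration under stated assumptions, not Flink code: `JobGraphRecoverySketch`, `compile`, and `StaticStore` are hypothetical names, a plain string stands in for the job graph, and an in-memory map stands in for a DFS or other durable store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the two job graph recovery options; none of these names are Flink classes. */
public class JobGraphRecoverySketch {

    /**
     * Stand-in for job graph compilation. It is deterministic in the configuration:
     * keys are sorted, so the same configuration always yields the same result.
     */
    public static String compile(Map<String, String> config) {
        return "JobGraph" + new TreeMap<>(config);
    }

    /** Option 1: rebuild the job graph from the configuration on every recovery. */
    public static String recoverByRecompiling(Map<String, String> config) {
        return compile(config);
    }

    /** Option 2: compile once at submission, store the result, and only fetch it on recovery. */
    public static class StaticStore {
        // Assumption: an in-memory map stands in for a durable store such as a DFS.
        private final Map<String, String> stored = new HashMap<>();

        public void submit(String jobId, Map<String, String> config) {
            stored.put(jobId, compile(config));
        }

        public String recover(String jobId) {
            String graph = stored.get(jobId);
            if (graph == null) {
                throw new IllegalStateException("No stored job graph for " + jobId);
            }
            return graph;
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("job_id", "job-1");
        config.put("main_class", "com.example.Main");
        config.put("parallelism", "4");

        StaticStore store = new StaticStore();
        store.submit("job-1", config);

        // Both options agree only as long as compilation is deterministic in the configuration.
        System.out.println(recoverByRecompiling(config).equals(store.recover("job-1")));
    }
}
```

The trade-off in the thread shows up directly: option 1 needs no store but depends on compilation being reproducible, while option 2 avoids recompilation at the cost of requiring durable storage.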
>>>>> >> >> > >
>>>>> >> >> > > Looking forward to more opinions from @Yang Wang <danrtsey...@gmail.com> @Zili Chen <wander4...@gmail.com> @Dian Fu <dian0511...@gmail.com>.
>>>>> >> >> > >
>>>>> >> >> > > Best Regards
>>>>> >> >> > > Peter Huang
>>>>> >> >> > >
>>>>> >> >> > > On Wed, Jan 15, 2020 at 7:50 AM Kostas Kloudas <kklou...@gmail.com> wrote:
>>>>> >> >> > >
>>>>> >> >> > >> Hi all,
>>>>> >> >> > >>
>>>>> >> >> > >> I am writing here as the discussion on the Google Doc seems to be a bit difficult to follow.
>>>>> >> >> > >>
>>>>> >> >> > >> I think that in order to be able to make progress, it would be helpful to focus on per-job mode for now.
>>>>> >> >> > >> The reason is that:
>>>>> >> >> > >> 1) making the (unique) JobSubmitHandler responsible for creating the jobgraphs, which includes downloading dependencies, is not an optimal solution
>>>>> >> >> > >> 2) even if we put the responsibility on the JobMaster, currently each job has its own JobMaster but they all run on the same process, so we have again a single entity.
>>>>> >> >> > >>
>>>>> >> >> > >> Of course after this is done, and if we feel comfortable with the solution, then we can go to the session mode.
>>>>> >> >> > >>
>>>>> >> >> > >> A second comment has to do with fault-tolerance in the per-job, cluster-deploy mode.
>>>>> >> >> > >> In the document, it is suggested that upon recovery, the JobMaster of each job re-creates the JobGraph.
>>>>> >> >> > >> I am just wondering if it is better to create and store the jobGraph upon submission and only fetch it upon recovery so that we have a static jobGraph.
>>>>> >> >> > >>
>>>>> >> >> > >> Finally, I have a question: what happens with jobs that have multiple execute calls?
>>>>> >> >> > >> The semantics seem to change compared to the current behaviour, right?
>>>>> >> >> > >>
>>>>> >> >> > >> Cheers,
>>>>> >> >> > >> Kostas
>>>>> >> >> > >>
>>>>> >> >> > >> On Wed, Jan 8, 2020 at 8:05 PM tison <wander4...@gmail.com> wrote:
>>>>> >> >> > >> >
>>>>> >> >> > >> > Not always; Yang Wang is also not yet a committer but he can join the channel. I cannot find your id by clicking “Add new member in channel”, so I came to you to ask you to try out the link. Possibly I will find other ways, but the original purpose is that the Slack channel is a public area where we discuss development...
>>>>> >> >> > >> > Best,
>>>>> >> >> > >> > tison.
>>>>> >> >> > >> >
>>>>> >> >> > >> > On Thu, Jan 9, 2020 at 2:44 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >> >
>>>>> >> >> > >> > > Hi Tison,
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > I am not a committer of Flink yet. I think I can't join it either.
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > Best Regards
>>>>> >> >> > >> > > Peter Huang
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > On Wed, Jan 8, 2020 at 9:39 AM tison <wander4...@gmail.com> wrote:
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > > Hi Peter,
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > Could you try out this link?
>>>>> >> >> > >> > > > https://the-asf.slack.com/messages/CNA3ADZPH
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > Best,
>>>>> >> >> > >> > > > tison.
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > On Thu, Jan 9, 2020 at 1:22 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > > Hi Tison,
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > I can't join the group with the shared link. Would you please add me to the group? My Slack account is huangzhenqiu0825.
>>>>> >> >> > >> > > > > Thank you in advance.
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > Best Regards
>>>>> >> >> > >> > > > > Peter Huang
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > On Wed, Jan 8, 2020 at 12:02 AM tison <wander4...@gmail.com> wrote:
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > > Hi Peter,
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > As described above, this effort should get attention from the people developing FLIP-73, a.k.a. Executor abstractions. I recommend that you join the public Slack channel[1] for Flink Client API Enhancement, where you can share your detailed thoughts. It will possibly get more concrete attention.
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > Best,
>>>>> >> >> > >> > > > > > tison.
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > [1] https://slack.com/share/IS21SJ75H/Rk8HhUly9FuEHb7oGwBZ33uL/enQtODg2MDYwNjE5MTg3LTA2MjIzNDc1M2ZjZDVlMjdlZjk1M2RkYmJhNjAwMTk2ZDZkODQ4NmY5YmI4OGRhNWJkYTViMTM1NzlmMzc4OWM
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > On Tue, Jan 7, 2020 at 5:09 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > > Dear All,
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > Happy new year! Based on existing feedback from the community, we revised the doc to cover session cluster support, the concrete interface changes needed, and the execution plan. Please take one more round of review at your convenience.
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edit#
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > Best Regards
>>>>> >> >> > >> > > > > > > Peter Huang
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > On Thu, Jan 2, 2020 at 11:29 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > > Hi Dian,
>>>>> >> >> > >> > > > > > > > Thanks for giving us valuable feedback.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 1) It's better to have a whole design for this feature
>>>>> >> >> > >> > > > > > > > Regarding the suggestion of enabling the cluster mode for session clusters as well, I think Flink already supports it: WebSubmissionExtension already allows users to start a job with a specified jar through the web UI. But we need to enable the feature from the CLI for both local and remote jars. I will first align with Yang Wang on the details and then update the design doc.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 2) It's better to consider the convenience for users, such as debugging
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > I am wondering whether we can store the exception from job graph generation in the application master. As no streaming graph can be scheduled in this case, no more TMs will be requested from FlinkRM. If the AM is still running, users can still query it from the CLI. As it requires more changes, we can get some feedback from <aljos...@apache.org> and @zjf...@gmail.com <zjf...@gmail.com>.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 3) It's better to consider the impact to the stability of the cluster
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > I agree with Yang Wang's opinion.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > Best Regards
>>>>> >> >> > >> > > > > > > > Peter Huang
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > On Sun, Dec 29, 2019 at 9:44 PM Dian Fu <dian0511...@gmail.com> wrote:
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > >> Hi all,
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Sorry to jump into this discussion. Thanks everyone for the discussion. I'm very interested in this topic although I'm not an expert in this part. So I'm glad to share my thoughts as follows:
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 1) It's better to have a whole design for this feature
>>>>> >> >> > >> > > > > > > >> As we know, there are two deployment modes: per-job mode and session mode. I'm wondering which mode really needs this feature. As the design doc mentioned, per-job mode is used more for streaming jobs and session mode is usually used for batch jobs (of course, the job types and the deployment modes are orthogonal).
Usually a streaming job only needs to be submitted once and will run for days or weeks, while batch jobs will be submitted more frequently than streaming jobs. This means that session mode may also need this feature. However, if we support this feature in session mode, the application master will become the new centralized service (which should be solved). So in this case, it's better to have a complete design for both per-job mode and session mode. Furthermore, even if we can do it phase by phase, we need to have a whole picture of how it works in both per-job mode and session mode.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 2) It's better to consider the convenience for users, such as debugging
>>>>> >> >> > >> > > > > > > >> After we finish this feature, the job graph will be compiled in the application master, which means that users cannot easily get the exception message synchronously in the job client if there are problems during job graph compilation (especially for platform users), such as an incorrect resource path or problems in the user program itself. What I'm thinking is that maybe we should throw the exceptions as early as possible (during the job submission stage).
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 3) It's better to consider the impact to the stability of the cluster
>>>>> >> >> > >> > > > > > > >> If we perform the compiling in the application master, we should consider the impact of the compiling errors.
Although YARN could resume the application master in case of failures, in some cases a compilation failure may waste cluster resources and may impact the stability of the cluster and the other jobs in it, for example when the resource path is incorrect or the user program itself has problems (in this case, job failover cannot solve this kind of problem). In the current implementation, the compiling errors are handled on the client side and there is no impact on the cluster at all.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Regarding 1), it's clearly pointed out in the design doc that only per-job mode will be supported. However, I think it's better to also consider the session mode in the design doc.
>>>>> >> >> > >> > > > > > > >> Regarding 2) and 3), I have not seen related sections in the design doc. It would be good if we could cover them in the design doc.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Feel free to correct me if there is anything I misunderstand.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Regards,
>>>>> >> >> > >> > > > > > > >> Dian
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> > On Dec 27, 2019, at 3:13 AM, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > Hi Yang,
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > I couldn't agree more. The effort definitely needs to align with the final goal of FLIP-73.
>>>>> >> >> > >> > > > > > > >> > I am thinking about whether we can achieve the goal in two phases.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > 1) Phase I
>>>>> >> >> > >> > > > > > > >> > The CliFrontend will not be deprecated soon.
We can still use the deployMode flag there, pass the program info through the Flink configuration, and use the ClassPathJobGraphRetriever to generate the job graph in the ClusterEntrypoints of Yarn and Kubernetes.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > 2) Phase II
>>>>> >> >> > >> > > > > > > >> > In AbstractJobClusterExecutor, the job graph is generated in the execute function. We can still use the deployMode in it. With deployMode = cluster, the execute function only starts the cluster.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > When {Yarn/Kubernetes}PerJobClusterEntrypoint starts, it will start the dispatcher first; then we can use a ClusterEnvironment similar to ContextEnvironment to submit the job with jobName to the local dispatcher. For the details, we need more investigation.
Let's wait for feedback from @Aljoscha Krettek <aljos...@apache.org> and @Till Rohrmann <trohrm...@apache.org> after the holiday season.

Thank you in advance. Merry Christmas and Happy New Year!!!

Best Regards
Peter Huang

On Wed, Dec 25, 2019 at 1:08 AM Yang Wang <danrtsey...@gmail.com> wrote:

Hi Peter,

I think we need to seriously reconsider tison's suggestion. After FLIP-73, deployJobCluster has been moved into `JobClusterExecutor#execute`. `CliFrontend` should not be aware of it. That means the user program will *ALWAYS* be executed on the client side.
This behavior is by design. So we cannot just add `if(client mode) .. else if(cluster mode) ...` code in `CliFrontend` to bypass the executor. We need to find a clean way to decouple executing the user program from deploying the per-job cluster. Based on this, we could support executing the user program on either the client or the master side.

Maybe Aljoscha and Jeff could give some good suggestions.

Best,
Yang

On Wed, Dec 25, 2019 at 4:03 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:

Hi Jingjing,

The proposed improvement is a deployment option for the CLI.
For SQL-based Flink applications, it is more convenient to use the existing model in SqlClient, in which the job graph is generated within SqlClient. After the delayed job graph generation is added, I think no change is needed on your side.

Best Regards
Peter Huang

On Wed, Dec 18, 2019 at 6:01 AM jingjing bai <baijingjing7...@gmail.com> wrote:

hi peter:
We have extended SqlClient to support SQL job submission from the web, based on Flink 1.9. We also support submitting to YARN in per-job mode.
In this case, the job graph is generated on the client side.
I think this discussion is mainly about improving API programs. But in my case, there is no jar to upload, only a SQL string.
Do you have more suggestions for improving the SQL mode, or is it only a switch for API programs?

best
bai jj

On Wed, Dec 18, 2019 at 7:21 PM Yang Wang <danrtsey...@gmail.com> wrote:

I just want to revive this discussion.

Recently, I have been thinking about how to natively run a Flink per-job cluster on Kubernetes. The per-job mode on Kubernetes is very different from the one on YARN, and we will have the same deployment requirements for the client and the entry point.
1. The Flink client does not always need a local jar to start a Flink per-job cluster. We could support multiple schemes. For example, file:///path/of/my.jar means a jar located on the client side, hdfs://myhdfs/user/myname/flink/my.jar means a jar located on a remote HDFS, and local:///path/in/image/my.jar means a jar located on the jobmanager side.

2. Support running the user program on the master side. This also means the entry point will generate the job graph on the master side. We could use the ClasspathJobGraphRetriever or start a local Flink client to achieve this.
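
As a rough illustration of point 1 above, here is a minimal sketch of how the three URI schemes could be mapped to jar locations. This is not Flink code: `JarLocations`, `Location`, and `classify` are hypothetical names, and treating a path without a scheme as client-local is an assumption made only for this example.

```java
import java.net.URI;

// Hypothetical helper, not part of Flink: maps a user jar URI to the place where the jar lives.
final class JarLocations {

    enum Location { CLIENT_LOCAL, REMOTE_STORAGE, CLUSTER_LOCAL }

    static Location classify(String jarUri) {
        String scheme = URI.create(jarUri).getScheme();
        // Assumption: a plain path without a scheme is treated like file://.
        if (scheme == null || scheme.equals("file")) {
            return Location.CLIENT_LOCAL;
        }
        switch (scheme) {
            case "hdfs":
                // Jar already sits on remote storage, nothing to upload from the client.
                return Location.REMOTE_STORAGE;
            case "local":
                // Jar is baked into the image and visible on the jobmanager side.
                return Location.CLUSTER_LOCAL;
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        String[] examples = {
            "file:///path/of/my.jar",
            "hdfs://myhdfs/user/myname/flink/my.jar",
            "local:///path/in/image/my.jar"
        };
        for (String uri : examples) {
            System.out.println(uri + " -> " + classify(uri));
        }
    }
}
```

With such a mapping, only the client-local case would require the client to ship the jar; the other two cases could leave the jar where it is.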

cc tison, Aljoscha & Kostas: do you think this is the right direction to work in?

On Thu, Dec 12, 2019 at 4:48 PM tison <wander4...@gmail.com> wrote:

A quick idea is that we separate the deployment from the user program, so that it is always done outside the program. When the user program is executed, there is always a ClusterClient that communicates with an existing cluster, remote or local. That will be another thread, so this is just for your information.

Best,
tison.

On Thu, Dec 12, 2019 at 4:40 PM tison <wander4...@gmail.com> wrote:

Hi Peter,

Another concern I realized recently is that, with the current Executors abstraction (FLIP-73), I'm afraid the user program is designed to ALWAYS run on the client side. Specifically, we deploy the job in the executor when env.execute is called. This abstraction possibly prevents Flink from running the user program on the cluster side.

For your proposal, in this case we have already compiled the program and run it on the client side. Even if we deploy a cluster and retrieve the job graph from program metadata, it doesn't make much sense.

cc Aljoscha & Kostas: what do you think about this constraint?

Best,
tison.

On Tue, Dec 10, 2019 at 12:45 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:

Hi Tison,

Yes, you are right. I think I made the wrong argument in the doc. Basically, the jar packaging problem only affects platform users.
In our internal deploy service, we further optimized the deployment latency by letting users package flink-runtime together with the uber jar, so that we don't need to consider support for multiple Flink versions for now. In the session client mode, Flink libs will be shipped anyway as local resources of YARN, so users actually don't need to package those libs into the job jar.

Best Regards
Peter Huang

On Mon, Dec 9, 2019 at 8:35 PM tison <wander4...@gmail.com> wrote:

> 3.
> What do you mean about the package? Do users need to compile their jars including flink-clients, flink-optimizer, flink-table codes?

The answer should be no, because they exist in the system classpath.

Best,
tison.

On Tue, Dec 10, 2019 at 12:18 PM Yang Wang <danrtsey...@gmail.com> wrote:

Hi Peter,

Thanks a lot for starting this discussion. I think this is a very useful feature.

This is not only relevant for YARN: I am focused on the Flink on Kubernetes integration and have come across the same problem. I do not want the job graph to be generated on the client side. Instead, the user jars are built into a user-defined image. When the job manager is launched, we just need to generate the job graph based on the local user jars.

I have some small suggestions about this.

1. `ProgramJobGraphRetriever` is very similar to `ClasspathJobGraphRetriever`. The difference is that the former needs `ProgramMetadata` and the latter needs some arguments.
Is it possible to have a unified `JobGraphRetriever` that supports both?
2. Is it possible to start a per-job cluster without a local user jar? In your case, the user jars already exist on HDFS and we do need to download the jars to the deployer service. Currently, we always need a local user jar to start a Flink cluster. It would be great if we could support remote user jars.
> In the implementation, we assume users package flink-clients, flink-optimizer, flink-table together within the job jar.
> Otherwise, the job graph generation within JobClusterEntryPoint will fail.
3. What do you mean about the package? Do users need to compile their jars including flink-clients, flink-optimizer, flink-table codes?

Best,
Yang

On Tue, Dec 10, 2019 at 2:37 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:

Dear All,

Recently, the Flink community started to improve the YARN cluster descriptor to make the job jar and config files configurable from the CLI.
This improves the flexibility of Flink deployment in YARN per-job mode. For platform users who manage tens or hundreds of streaming pipelines for a whole org or company, we found that job graph generation on the client side is another pain point. Thus, we want to propose a configurable feature for FlinkYarnSessionCli. The feature allows users to choose job graph generation in the Flink ClusterEntryPoint, so that the job jar doesn't need to be available locally for job graph generation.
The proposal is organized as a FLIP:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation

Any questions and suggestions are welcome. Thank you in advance.

Best Regards
Peter Huang