Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

Kostas Kloudas Thu, 05 Mar 2020 06:09:54 -0800

Also from my side, +1 to start voting.

Cheers,
Kostas


On Thu, Mar 5, 2020 at 7:45 AM tison <[email protected]> wrote:
>
> +1 to start voting.
>
> Best,
> tison.
>
>
> Yang Wang <[email protected]> wrote on Thu, Mar 5, 2020 at 2:29 PM:
>>
>> Hi Peter,
>> Many thanks for your response.
>>
>> Hi all @Kostas Kloudas @Zili Chen @Peter Huang @Rong Rong
>> It seems that we have reached an agreement. The “application mode” is
>> regarded as an enhanced “per-job” mode. It is
>> orthogonal to “cluster deploy”. Currently, we bind “per-job” to
>> `run-user-main-on-client` and “application mode”
>> to `run-user-main-on-cluster`.
>>
>> Do you have any other concerns about moving FLIP-85 to voting?
>>
>>
>> Best,
>> Yang
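
The binding described in the message above can be sketched as a small model. This is only an illustration in Python, not Flink code: `MainLocation`, `DeploymentMode`, and `resolve_mode` are hypothetical names made up for the example. Only the two labels `run-user-main-on-client` and `run-user-main-on-cluster` and the proposed `-R` option come from the thread.

```python
from dataclasses import dataclass
from enum import Enum


class MainLocation(Enum):
    """Where the user main() runs (labels taken from the thread)."""
    CLIENT = "run-user-main-on-client"
    CLUSTER = "run-user-main-on-cluster"


@dataclass(frozen=True)
class DeploymentMode:
    """Hypothetical pairing of a mode name with where main() runs."""
    name: str
    main_location: MainLocation


# The binding agreed on in the discussion.
PER_JOB = DeploymentMode("per-job", MainLocation.CLIENT)
APPLICATION = DeploymentMode("application", MainLocation.CLUSTER)


def resolve_mode(remote_deploy: bool) -> DeploymentMode:
    """Map the proposed -R option (assumed to be a boolean flag) to a mode."""
    return APPLICATION if remote_deploy else PER_JOB
```

Under this model the user only ever toggles one flag, which matches the point that the application mode itself need not be visible to users.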
>>
>> Peter Huang <[email protected]> wrote on Thu, Mar 5, 2020 at 12:48 PM:
>>>
>>> Hi Yang and Kostas,
>>>
>>> Thanks for the clarification. It makes more sense to me if the long-term
>>> goal is to replace per-job mode with application mode
>>> in the future (once multiple `execute()` calls can be supported). Until
>>> then, it will be better to keep the concept of
>>> application mode internal. As Yang suggested, users only need to use a
>>> `-R/--remote-deploy` CLI option to launch
>>> a per-job cluster with the main function executed in the cluster entry point.
>>> +1 for the execution plan.
>>>
>>>
>>>
>>> Best Regards
>>> Peter Huang
>>>
>>>
>>>
>>>
>>> On Tue, Mar 3, 2020 at 7:11 AM Yang Wang <[email protected]> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> Having the application mode does not mean we will drop the cluster-deploy
>>>> option. I just want to share some thoughts about “Application Mode”.
>>>>
>>>>
>>>> 1. The application mode could cover the per-job semantics. Its lifecycle is
>>>> bound to the user `main()`, and all the jobs in the user main will be
>>>> executed in the same Flink cluster. In the first phase of the FLIP-85
>>>> implementation, running the user main on the cluster side could be
>>>> supported in application mode.
>>>>
>>>> 2. Maybe in the future, we will also need to support multiple `execute()`
>>>> calls on the client side in the same Flink cluster. Then the per-job mode
>>>> will evolve into application mode.
>>>>
>>>> 3. From the user's perspective, only a `-R/--remote-deploy` CLI option is
>>>> visible. Users are not aware of the application mode.
>>>>
>>>> 4. In the first phase, the application mode works as “per-job” (only
>>>> one job in the user main). We just leave more potential for the future.
>>>>
>>>>
>>>> I am not against calling it “cluster deploy mode” if you all think it
>>>> is clearer for users.
>>>>
>>>>
>>>>
>>>> Best,
>>>> Yang
>>>>
>>>> Kostas Kloudas <[email protected]> wrote on Tue, Mar 3, 2020 at 6:49 PM:
>>>>>
>>>>> Hi Peter,
>>>>>
>>>>> I understand your point. This is why I was also a bit torn about the
>>>>> name and my proposal was a bit aligned with yours (something along the
>>>>> lines of "cluster deploy" mode).
>>>>>
>>>>> But many of the other participants in the discussion suggested the
>>>>> "Application Mode". I think that the reasoning is that now the user's
>>>>> Application is more self-contained.
>>>>> It will be submitted to the cluster and the user can just disconnect.
>>>>> In addition, as discussed briefly in the doc, in the future there may
>>>>> be better support for multi-execute applications which will bring us
>>>>> one step closer to the true "Application Mode". But this is how I
>>>>> interpreted their arguments, of course they can also express their
>>>>> thoughts on the topic :)
>>>>>
>>>>> Cheers,
>>>>> Kostas
>>>>>
>>>>> On Mon, Mar 2, 2020 at 6:15 PM Peter Huang <[email protected]> 
>>>>> wrote:
>>>>> >
>>>>> > Hi Kostas,
>>>>> >
>>>>> > Thanks for updating the wiki. We are aligned on the implementation
>>>>> > in the doc. But I feel the naming is still a little confusing
>>>>> > from a user's perspective. It is well known that Flink supports per-job
>>>>> > clusters and session clusters. That concept sits at the layer of how a
>>>>> > job is managed within Flink. The method introduced until now is a kind
>>>>> > of mix of job and session cluster that compromises on implementation
>>>>> > complexity. We probably don't need to label it as Application Mode at
>>>>> > the same layer as per-job cluster and session cluster. Conceptually, I
>>>>> > think it is still a cluster-mode implementation for the per-job cluster.
>>>>> >
>>>>> > To minimize confusion for users, I think it would be better to make it
>>>>> > just an option of the per-job cluster for each type of cluster manager.
>>>>> > What do you think?
>>>>> >
>>>>> >
>>>>> > Best Regards
>>>>> > Peter Huang
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 2, 2020 at 7:22 AM Kostas Kloudas <[email protected]> 
>>>>> > wrote:
>>>>> >>
>>>>> >> Hi Yang,
>>>>> >>
>>>>> >> The difference between per-job and application mode is that, as you
>>>>> >> described, in the per-job mode the main is executed on the client
>>>>> >> while in the application mode, the main is executed on the cluster.
>>>>> >> I do not think we have to offer "application mode" with running the
>>>>> >> main on the client side as this is exactly what the per-job mode does
>>>>> >> currently and, as you described also, it would be redundant.
>>>>> >>
>>>>> >> Sorry if this was not clear in the document.
>>>>> >>
>>>>> >> Cheers,
>>>>> >> Kostas
>>>>> >>
>>>>> >> On Mon, Mar 2, 2020 at 3:17 PM Yang Wang <[email protected]> wrote:
>>>>> >> >
>>>>> >> > Hi Kostas,
>>>>> >> >
>>>>> >> > Thanks a lot for your conclusion and for updating the FLIP-85 wiki.
>>>>> >> > Currently, I have no more questions about motivation, approach, fault
>>>>> >> > tolerance and the first phase implementation.
>>>>> >> >
>>>>> >> > I think the new title "Flink Application Mode" makes a lot of sense to
>>>>> >> > me. Especially for containerized environments, the cluster deploy
>>>>> >> > option will be very useful.
>>>>> >> >
>>>>> >> > Just one concern: how do we introduce this new application mode to
>>>>> >> > our users?
>>>>> >> > Each user program (i.e. `main()`) is an application. Currently, we
>>>>> >> > intend to support only one `execute()`. So what's the difference
>>>>> >> > between per-job and application mode?
>>>>> >> >
>>>>> >> > For per-job, the user `main()` is always executed on the client side.
>>>>> >> > For application mode, the user `main()` could be executed on the
>>>>> >> > client or master side (configured via a CLI option).
>>>>> >> > Right? We need to have a clear concept. Otherwise, the users will be
>>>>> >> > more and more confused.
>>>>> >> >
>>>>> >> >
>>>>> >> > Best,
>>>>> >> > Yang
>>>>> >> >
>>>>> >> > Kostas Kloudas <[email protected]> wrote on Mon, Mar 2, 2020 at 5:58 PM:
>>>>> >> >>
>>>>> >> >> Hi all,
>>>>> >> >>
>>>>> >> >> I updated
>>>>> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode
>>>>> >> >> based on the discussion we had here:
>>>>> >> >>
>>>>> >> >> https://docs.google.com/document/d/1ji72s3FD9DYUyGuKnJoO4ApzV-nSsZa0-bceGXW7Ocw/edit#
>>>>> >> >>
>>>>> >> >> Please let me know what you think and please keep the discussion in 
>>>>> >> >> the ML :)
>>>>> >> >>
>>>>> >> >> Thanks for starting the discussion and I hope that soon we will be
>>>>> >> >> able to vote on the FLIP.
>>>>> >> >>
>>>>> >> >> Cheers,
>>>>> >> >> Kostas
>>>>> >> >>
>>>>> >> >> On Thu, Jan 16, 2020 at 3:40 AM Yang Wang <[email protected]> 
>>>>> >> >> wrote:
>>>>> >> >> >
>>>>> >> >> > Hi all,
>>>>> >> >> >
>>>>> >> >> > Thanks a lot for the feedback from @Kostas Kloudas. All your
>>>>> >> >> > concerns are on point. FLIP-85 is mainly focused on supporting
>>>>> >> >> > cluster mode for per-job, since it is more urgent and has many
>>>>> >> >> > more use cases in both Yarn and Kubernetes deployments. For
>>>>> >> >> > session clusters, we could have more discussion in a new thread
>>>>> >> >> > later.
>>>>> >> >> >
>>>>> >> >> > #1, How to download the user jars and dependencies for per-job in
>>>>> >> >> > cluster mode?
>>>>> >> >> > For Yarn, we could register the user jars and dependencies as
>>>>> >> >> > LocalResource. They will be distributed by Yarn, and once the
>>>>> >> >> > JobManager and TaskManager are launched, the jars already exist.
>>>>> >> >> > For Standalone per-job and K8s, we expect that the user jars
>>>>> >> >> > and dependencies are built into the image.
>>>>> >> >> > Or an InitContainer could be used for downloading. It is natively
>>>>> >> >> > distributed and we will not have a bottleneck.
>>>>> >> >> >
>>>>> >> >> > #2, Job graph recovery
>>>>> >> >> > We could have an optimization to store the job graph on the DFS.
>>>>> >> >> > However, I suggest that building a new job graph from the
>>>>> >> >> > configuration be the default option, since we will not always
>>>>> >> >> > have a DFS store when deploying a Flink per-job cluster. Of
>>>>> >> >> > course, we assume that using the same configuration (e.g. job_id,
>>>>> >> >> > user_jar, main_class, main_args, parallelism, savepoint_settings,
>>>>> >> >> > etc.) will produce the same job graph. I think the standalone
>>>>> >> >> > per-job mode already has similar behavior.
>>>>> >> >> >
>>>>> >> >> > #3, What happens with jobs that have multiple execute calls?
>>>>> >> >> > Currently, it is really a problem. Even if we use a local client
>>>>> >> >> > on the Flink master side, it will behave differently from client
>>>>> >> >> > mode. In client mode, if we execute multiple times, we will
>>>>> >> >> > deploy a separate Flink cluster for each execute.
>>>>> >> >> > I am not quite sure whether that is reasonable. However, I still
>>>>> >> >> > think using the local client is a good choice. We could
>>>>> >> >> > continue the discussion in a new thread. @Zili Chen
>>>>> >> >> > <[email protected]> Do you want to drive this?
>>>>> >> >> >
>>>>> >> >> >
>>>>> >> >> >
>>>>> >> >> > Best,
>>>>> >> >> > Yang
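
Point #2 in the message above rests on the assumption that the same configuration always yields the same job graph. A minimal sketch of one way to check that the configuration seen at recovery matches the one used at submission is shown here. It is hypothetical: `config_fingerprint` and all sample values are invented for illustration, and only the key names (job_id, user_jar, main_class, main_args, parallelism, savepoint_settings) are taken from the message.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a stable SHA-256 hex digest of a job configuration.

    Keys are sorted before hashing, so two dicts holding the same entries
    give the same digest regardless of insertion order.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configuration captured at submission time.
submitted = {
    "job_id": "example-job",
    "user_jar": "/opt/flink/usrlib/example.jar",
    "main_class": "com.example.Main",
    "main_args": ["--input", "in", "--output", "out"],
    "parallelism": 4,
    "savepoint_settings": None,
}

# The same entries, re-read in a different key order during recovery.
recovered = dict(reversed(list(submitted.items())))

# Identical entries give identical fingerprints; any change is detected.
same_config = config_fingerprint(submitted) == config_fingerprint(recovered)
changed = config_fingerprint({**submitted, "parallelism": 8}) != config_fingerprint(submitted)
```

Matching fingerprints only show that the inputs are identical. Whether the user `main()` itself builds the job graph deterministically from those inputs remains a separate assumption.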
>>>>> >> >> >
>>>>> >> >> > Peter Huang <[email protected]> wrote on Thu, Jan 16, 2020 at 1:55 AM:
>>>>> >> >> >
>>>>> >> >> > > Hi Kostas,
>>>>> >> >> > >
>>>>> >> >> > > Thanks for this feedback. I couldn't agree more. The cluster
>>>>> >> >> > > mode should be added first in per-job cluster.
>>>>> >> >> > >
>>>>> >> >> > > 1) For job cluster implementation
>>>>> >> >> > > 1. Job graph recovery from configuration, or storing a static
>>>>> >> >> > > job graph as in session cluster: I think the static one will be
>>>>> >> >> > > better because of the shorter recovery time.
>>>>> >> >> > > Let me update the doc for details.
>>>>> >> >> > >
>>>>> >> >> > > 2. For jobs that execute multiple times, I think @Zili Chen
>>>>> >> >> > > <[email protected]> has proposed the local client solution
>>>>> >> >> > > that can actually run the program in the cluster entry point.
>>>>> >> >> > > We can put the implementation in the second stage,
>>>>> >> >> > > or even in a new FLIP for further discussion.
>>>>> >> >> > >
>>>>> >> >> > > 2) For session cluster implementation
>>>>> >> >> > > We can disable the cluster mode for the session cluster in the
>>>>> >> >> > > first stage. I agree the jar downloading will be a painful thing.
>>>>> >> >> > > We can consider a PoC and performance evaluation first. If the
>>>>> >> >> > > end-to-end experience is good enough, then we can consider
>>>>> >> >> > > proceeding with the solution.
>>>>> >> >> > >
>>>>> >> >> > > Looking forward to more opinions from @Yang Wang 
>>>>> >> >> > > <[email protected]> @Zili
>>>>> >> >> > > Chen <[email protected]> @Dian Fu <[email protected]>.
>>>>> >> >> > >
>>>>> >> >> > >
>>>>> >> >> > > Best Regards
>>>>> >> >> > > Peter Huang
>>>>> >> >> > >
>>>>> >> >> > > On Wed, Jan 15, 2020 at 7:50 AM Kostas Kloudas 
>>>>> >> >> > > <[email protected]> wrote:
>>>>> >> >> > >
>>>>> >> >> > >> Hi all,
>>>>> >> >> > >>
>>>>> >> >> > >> I am writing here as the discussion on the Google Doc seems to 
>>>>> >> >> > >> be a
>>>>> >> >> > >> bit difficult to follow.
>>>>> >> >> > >>
>>>>> >> >> > >> I think that in order to be able to make progress, it would be 
>>>>> >> >> > >> helpful
>>>>> >> >> > >> to focus on per-job mode for now.
>>>>> >> >> > >> The reason is that:
>>>>> >> >> > >>  1) making the (unique) JobSubmitHandler responsible for 
>>>>> >> >> > >> creating the
>>>>> >> >> > >> jobgraphs,
>>>>> >> >> > >>   which includes downloading dependencies, is not an optimal 
>>>>> >> >> > >> solution
>>>>> >> >> > >>  2) even if we put the responsibility on the JobMaster, 
>>>>> >> >> > >> currently each
>>>>> >> >> > >> job has its own
>>>>> >> >> > >>   JobMaster but they all run on the same process, so we have 
>>>>> >> >> > >> again a
>>>>> >> >> > >> single entity.
>>>>> >> >> > >>
>>>>> >> >> > >> Of course after this is done, and if we feel comfortable with 
>>>>> >> >> > >> the
>>>>> >> >> > >> solution, then we can go to the session mode.
>>>>> >> >> > >>
>>>>> >> >> > >> A second comment has to do with fault-tolerance in the per-job,
>>>>> >> >> > >> cluster-deploy mode.
>>>>> >> >> > >> In the document, it is suggested that upon recovery, the 
>>>>> >> >> > >> JobMaster of
>>>>> >> >> > >> each job re-creates the JobGraph.
>>>>> >> >> > >> I am just wondering if it is better to create and store the 
>>>>> >> >> > >> jobGraph
>>>>> >> >> > >> upon submission and only fetch it
>>>>> >> >> > >> upon recovery so that we have a static jobGraph.
>>>>> >> >> > >>
>>>>> >> >> > >> Finally, I have a question which is what happens with jobs 
>>>>> >> >> > >> that have
>>>>> >> >> > >> multiple execute calls?
>>>>> >> >> > >> The semantics seem to change compared to the current 
>>>>> >> >> > >> behaviour, right?
>>>>> >> >> > >>
>>>>> >> >> > >> Cheers,
>>>>> >> >> > >> Kostas
>>>>> >> >> > >>
>>>>> >> >> > >> On Wed, Jan 8, 2020 at 8:05 PM tison <[email protected]> 
>>>>> >> >> > >> wrote:
>>>>> >> >> > >> >
>>>>> >> >> > >> > not always, Yang Wang is also not yet a committer but he can
>>>>> >> >> > >> > join the channel. I cannot find the id by clicking “Add new
>>>>> >> >> > >> > member in channel”, so I came to you to ask you to try out the
>>>>> >> >> > >> > link. Possibly I will find other ways, but the original
>>>>> >> >> > >> > purpose is that the Slack channel is a public area where we
>>>>> >> >> > >> > discuss development...
>>>>> >> >> > >> > Best,
>>>>> >> >> > >> > tison.
>>>>> >> >> > >> >
>>>>> >> >> > >> >
>>>>> >> >> > >> > Peter Huang <[email protected]> wrote on Thu, Jan 9, 2020 at 2:44 AM:
>>>>> >> >> > >> >
>>>>> >> >> > >> > > Hi Tison,
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > I am not a Flink committer yet, so I think I can't join it
>>>>> >> >> > >> > > either.
>>>>> >> >> > >> > >
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > Best Regards
>>>>> >> >> > >> > > Peter Huang
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > On Wed, Jan 8, 2020 at 9:39 AM tison 
>>>>> >> >> > >> > > <[email protected]> wrote:
>>>>> >> >> > >> > >
>>>>> >> >> > >> > > > Hi Peter,
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > Could you try out this link?
>>>>> >> >> > >> > > > https://the-asf.slack.com/messages/CNA3ADZPH
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > Best,
>>>>> >> >> > >> > > > tison.
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > Peter Huang <[email protected]> wrote on Thu, Jan 9, 2020 at 1:22 AM:
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > > > > Hi Tison,
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > I can't join the group with the shared link. Would you
>>>>> >> >> > >> > > > > please add me to the group? My Slack account is
>>>>> >> >> > >> > > > > huangzhenqiu0825.
>>>>> >> >> > >> > > > > Thank you in advance.
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > Best Regards
>>>>> >> >> > >> > > > > Peter Huang
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > On Wed, Jan 8, 2020 at 12:02 AM tison <[email protected]> wrote:
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > > > > Hi Peter,
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > As described above, this effort should get attention
>>>>> >> >> > >> > > > > > from the people developing FLIP-73, a.k.a. Executor
>>>>> >> >> > >> > > > > > abstractions. I recommend that you join the public
>>>>> >> >> > >> > > > > > Slack channel[1] for Flink Client API Enhancement,
>>>>> >> >> > >> > > > > > where you can try to share your detailed thoughts.
>>>>> >> >> > >> > > > > > It will possibly get more concrete attention.
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > Best,
>>>>> >> >> > >> > > > > > tison.
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > [1]
>>>>> >> >> > >> > > > > > https://slack.com/share/IS21SJ75H/Rk8HhUly9FuEHb7oGwBZ33uL/enQtODg2MDYwNjE5MTg3LTA2MjIzNDc1M2ZjZDVlMjdlZjk1M2RkYmJhNjAwMTk2ZDZkODQ4NmY5YmI4OGRhNWJkYTViMTM1NzlmMzc4OWM
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > Peter Huang <[email protected]> wrote on Tue, Jan 7, 2020 at 5:09 AM:
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > > > > Dear All,
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > Happy new year! Based on the existing feedback from
>>>>> >> >> > >> > > > > > > the community, we revised the doc to cover session
>>>>> >> >> > >> > > > > > > cluster support, the concrete interface changes
>>>>> >> >> > >> > > > > > > needed, and the execution plan. Please take one
>>>>> >> >> > >> > > > > > > more round of review at your convenience.
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edit#
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > Best Regards
>>>>> >> >> > >> > > > > > > Peter Huang
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > On Thu, Jan 2, 2020 at 11:29 AM Peter Huang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > > > > Hi Dian,
>>>>> >> >> > >> > > > > > > > Thanks for giving us valuable feedback.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 1) It's better to have a whole design for this feature
>>>>> >> >> > >> > > > > > > > Regarding the suggestion of enabling the cluster mode
>>>>> >> >> > >> > > > > > > > for session clusters as well, I think Flink already
>>>>> >> >> > >> > > > > > > > supports it. WebSubmissionExtension already allows
>>>>> >> >> > >> > > > > > > > users to start a job with the specified jar by using
>>>>> >> >> > >> > > > > > > > the web UI. But we need to enable the feature from the
>>>>> >> >> > >> > > > > > > > CLI for both local and remote jars.
>>>>> >> >> > >> > > > > > > > I will align with Yang Wang first about the details
>>>>> >> >> > >> > > > > > > > and update the design doc.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 2) It's better to consider the convenience for users,
>>>>> >> >> > >> > > > > > > > such as debugging
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > I am wondering whether we can store the exception from
>>>>> >> >> > >> > > > > > > > job graph generation in the application master. As no
>>>>> >> >> > >> > > > > > > > streaming graph can be scheduled in this case, no more
>>>>> >> >> > >> > > > > > > > TMs will be requested from FlinkRM.
>>>>> >> >> > >> > > > > > > > If the AM is still running, users can still query it
>>>>> >> >> > >> > > > > > > > from the CLI. As it requires more changes, we can get
>>>>> >> >> > >> > > > > > > > some feedback from <[email protected]>
>>>>> >> >> > >> > > > > > > > and @[email protected] <[email protected]>.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > 3) It's better to consider the impact to the stability
>>>>> >> >> > >> > > > > > > > of the cluster
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > I agree with Yang Wang's opinion.
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > Best Regards
>>>>> >> >> > >> > > > > > > > Peter Huang
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > > On Sun, Dec 29, 2019 at 9:44 PM Dian Fu <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >
>>>>> >> >> > >> > > > > > > >> Hi all,
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Sorry to jump into this discussion. Thanks everyone
>>>>> >> >> > >> > > > > > > >> for the discussion.
>>>>> >> >> > >> > > > > > > >> I'm very interested in this topic although I'm not an
>>>>> >> >> > >> > > > > > > >> expert in this part.
>>>>> >> >> > >> > > > > > > >> So I'm glad to share my thoughts as follows:
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 1) It's better to have a whole design for this feature
>>>>> >> >> > >> > > > > > > >> As we know, there are two deployment modes: per-job
>>>>> >> >> > >> > > > > > > >> mode and session mode. I'm wondering which mode really
>>>>> >> >> > >> > > > > > > >> needs this feature. As the design doc mentioned,
>>>>> >> >> > >> > > > > > > >> per-job mode is more often used for streaming jobs and
>>>>> >> >> > >> > > > > > > >> session mode is usually used for batch jobs (of
>>>>> >> >> > >> > > > > > > >> course, the job types and the deployment modes are
>>>>> >> >> > >> > > > > > > >> orthogonal). Usually a streaming job only needs to be
>>>>> >> >> > >> > > > > > > >> submitted once and it will run for days or weeks,
>>>>> >> >> > >> > > > > > > >> while batch jobs will be submitted more frequently
>>>>> >> >> > >> > > > > > > >> than streaming jobs. This means that maybe session
>>>>> >> >> > >> > > > > > > >> mode also needs this feature. However, if we support
>>>>> >> >> > >> > > > > > > >> this feature in session mode, the application master
>>>>> >> >> > >> > > > > > > >> will become the new centralized service (which should
>>>>> >> >> > >> > > > > > > >> be solved). So in this case, it's better to have a
>>>>> >> >> > >> > > > > > > >> complete design for both per-job mode and session
>>>>> >> >> > >> > > > > > > >> mode. Furthermore, even if we can do it phase by
>>>>> >> >> > >> > > > > > > >> phase, we need to have a whole picture of how it
>>>>> >> >> > >> > > > > > > >> works in both per-job mode and session mode.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 2) It's better to consider the convenience for users,
>>>>> >> >> > >> > > > > > > >> such as debugging
>>>>> >> >> > >> > > > > > > >> After we finish this feature, the job graph will be
>>>>> >> >> > >> > > > > > > >> compiled in the application master, which means that
>>>>> >> >> > >> > > > > > > >> users cannot easily get the exception message
>>>>> >> >> > >> > > > > > > >> synchronously in the job client if there are problems
>>>>> >> >> > >> > > > > > > >> during the job graph compilation (especially for
>>>>> >> >> > >> > > > > > > >> platform users), such as an incorrect resource path
>>>>> >> >> > >> > > > > > > >> or problems in the user program itself. What I'm
>>>>> >> >> > >> > > > > > > >> thinking is that maybe we should throw the exceptions
>>>>> >> >> > >> > > > > > > >> as early as possible (during the job submission
>>>>> >> >> > >> > > > > > > >> stage).
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> 3) It's better to consider the impact to the stability
>>>>> >> >> > >> > > > > > > >> of the cluster
>>>>> >> >> > >> > > > > > > >> If we perform the compiling in the application master,
>>>>> >> >> > >> > > > > > > >> we should consider the impact of the compiling errors.
>>>>> >> >> > >> > > > > > > >> Although YARN could resume the application master in
>>>>> >> >> > >> > > > > > > >> case of failures, in some cases the compiling failure
>>>>> >> >> > >> > > > > > > >> may be a waste of cluster resources and may impact the
>>>>> >> >> > >> > > > > > > >> stability of the cluster and the other jobs in the
>>>>> >> >> > >> > > > > > > >> cluster, for example when the resource path is
>>>>> >> >> > >> > > > > > > >> incorrect or the user program itself has some problems
>>>>> >> >> > >> > > > > > > >> (in this case, job failover cannot solve this kind of
>>>>> >> >> > >> > > > > > > >> problem). In the current implementation, the compiling
>>>>> >> >> > >> > > > > > > >> errors are handled on the client side and there
>>>>> >> >> > >> > > > > > > >> is no impact on the cluster at all.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Regarding 1), the design doc clearly states that only
>>>>> >> >> > >> > > > > > > >> per-job mode will be supported. However, I think it's
>>>>> >> >> > >> > > > > > > >> better to also consider the session mode in the
>>>>> >> >> > >> > > > > > > >> design doc.
>>>>> >> >> > >> > > > > > > >> Regarding 2) and 3), I have not seen related sections
>>>>> >> >> > >> > > > > > > >> in the design doc. It will be good if we can cover
>>>>> >> >> > >> > > > > > > >> them in the design doc.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Feel free to correct me if there is anything I
>>>>> >> >> > >> > > > > > > >> misunderstand.
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> Regards,
>>>>> >> >> > >> > > > > > > >> Dian
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >> > On Dec 27, 2019, at 3:13 AM, Peter Huang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > Hi Yang,
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > I can't agree more. The effort definitely needs to
>>>>> >> >> > >> > > > > > > >> > align with the final goal of FLIP-73.
>>>>> >> >> > >> > > > > > > >> > I am thinking about whether we can achieve the goal
>>>>> >> >> > >> > > > > > > >> > in two phases.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > 1) Phase I
>>>>> >> >> > >> > > > > > > >> > As the CliFrontend will not be deprecated soon, we
>>>>> >> >> > >> > > > > > > >> > can still use the deployMode flag there,
>>>>> >> >> > >> > > > > > > >> > pass the program info through the Flink
>>>>> >> >> > >> > > > > > > >> > configuration, and use the
>>>>> >> >> > >> > > > > > > >> > ClassPathJobGraphRetriever
>>>>> >> >> > >> > > > > > > >> > to generate the job graph in the ClusterEntrypoints
>>>>> >> >> > >> > > > > > > >> > of Yarn and Kubernetes.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > 2) Phase II
>>>>> >> >> > >> > > > > > > >> > In AbstractJobClusterExecutor, the job graph is
>>>>> >> >> > >> > > > > > > >> > generated in the execute function. We can still
>>>>> >> >> > >> > > > > > > >> > use the deployMode in it. With deployMode = cluster,
>>>>> >> >> > >> > > > > > > >> > the execute function only starts the cluster.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > When {Yarn/Kubernetes}PerJobClusterEntrypoint
>>>>> >> >> > >> > > > > > > >> > starts, it will start the dispatcher first, then we
>>>>> >> >> > >> > > > > > > >> > can use a ClusterEnvironment similar to
>>>>> >> >> > >> > > > > > > >> > ContextEnvironment to submit the job with jobName to
>>>>> >> >> > >> > > > > > > >> > the local dispatcher. For the details, we need more
>>>>> >> >> > >> > > > > > > >> > investigation. Let's wait for @Aljoscha
>>>>> >> >> > >> > > > > > > >> > Krettek <[email protected]> @Till Rohrmann
>>>>> >> >> > >> > > > > > > >> > <[email protected]>'s
>>>>> >> >> > >> > > > > > > >> > feedback after the holiday season.
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > Thank you in advance. Merry Christmas and Happy New
>>>>> >> >> > >> > > > > > > >> > Year!!!
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > Best Regards
>>>>> >> >> > >> > > > > > > >> > Peter Huang
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> > On Wed, Dec 25, 2019 at 1:08 AM Yang Wang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >
>>>>> >> >> > >> > > > > > > >> >> Hi Peter,
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >> I think we need to seriously reconsider tison's
>>>>> >> >> > >> > > > > > > >> >> suggestion. After FLIP-73, deployJobCluster has been
>>>>> >> >> > >> > > > > > > >> >> moved into `JobClusterExecutor#execute`. It should not
>>>>> >> >> > >> > > > > > > >> >> be perceived by `CliFrontend`. That means the user
>>>>> >> >> > >> > > > > > > >> >> program will *ALWAYS* be executed on the client side.
>>>>> >> >> > >> > > > > > > >> >> This is the by-design behavior.
>>>>> >> >> > >> > > > > > > >> >> So we cannot just add `if(client mode) .. else
>>>>> >> >> > >> > > > > > > >> >> if(cluster mode) ...` code in `CliFrontend` to bypass
>>>>> >> >> > >> > > > > > > >> >> the executor. We need to find a clean way to decouple
>>>>> >> >> > >> > > > > > > >> >> executing the user program from deploying the per-job
>>>>> >> >> > >> > > > > > > >> >> cluster. Based on this, we could support executing the
>>>>> >> >> > >> > > > > > > >> >> user program on either the client or the master side.
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >> Maybe Aljoscha and Jeff could give some good suggestions.
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >> Best,
>>>>> >> >> > >> > > > > > > >> >> Yang
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >> On Wed, Dec 25, 2019 at 4:03 AM Peter Huang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >> >>> Hi Jingjing,
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>> The proposed improvement is a deployment option for
>>>>> >> >> > >> > > > > > > >> >>> the CLI. For SQL-based Flink applications, it is more
>>>>> >> >> > >> > > > > > > >> >>> convenient to use the existing model in SqlClient, in
>>>>> >> >> > >> > > > > > > >> >>> which the job graph is generated within SqlClient.
>>>>> >> >> > >> > > > > > > >> >>> After adding the delayed job graph generation, I think
>>>>> >> >> > >> > > > > > > >> >>> no change is needed on your side.
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>> Best Regards
>>>>> >> >> > >> > > > > > > >> >>> Peter Huang
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>> On Wed, Dec 18, 2019 at 6:01 AM jingjing bai <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>>> Hi Peter,
>>>>> >> >> > >> > > > > > > >> >>>>    We have extended SqlClient to support SQL job
>>>>> >> >> > >> > > > > > > >> >>>> submission from the web, based on Flink 1.9. We
>>>>> >> >> > >> > > > > > > >> >>>> support submitting to Yarn in per-job mode too.
>>>>> >> >> > >> > > > > > > >> >>>>    In this case, the job graph is generated on the
>>>>> >> >> > >> > > > > > > >> >>>> client side. I think this discussion is mainly about
>>>>> >> >> > >> > > > > > > >> >>>> improving API programs, but in my case there is no
>>>>> >> >> > >> > > > > > > >> >>>> jar to upload, only a SQL string.
>>>>> >> >> > >> > > > > > > >> >>>>    Do you have more suggestions to improve the SQL
>>>>> >> >> > >> > > > > > > >> >>>> mode, or is it only a switch for API programs?
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>> best
>>>>> >> >> > >> > > > > > > >> >>>> bai jj
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>> On Wed, Dec 18, 2019 at 7:21 PM Yang Wang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>>> I just want to revive this discussion.
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>> Recently, I have been thinking about how to natively
>>>>> >> >> > >> > > > > > > >> >>>>> run a Flink per-job cluster on Kubernetes.
>>>>> >> >> > >> > > > > > > >> >>>>> The per-job mode on Kubernetes is very different
>>>>> >> >> > >> > > > > > > >> >>>>> from the one on Yarn. And we will have the same
>>>>> >> >> > >> > > > > > > >> >>>>> deployment requirements for the client and the entry
>>>>> >> >> > >> > > > > > > >> >>>>> point.
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>> 1. The Flink client does not always need a local jar
>>>>> >> >> > >> > > > > > > >> >>>>> to start a Flink per-job cluster. We could support
>>>>> >> >> > >> > > > > > > >> >>>>> multiple schemes. For example,
>>>>> >> >> > >> > > > > > > >> >>>>> file:///path/of/my.jar means a jar located on the
>>>>> >> >> > >> > > > > > > >> >>>>> client side,
>>>>> >> >> > >> > > > > > > >> >>>>> hdfs://myhdfs/user/myname/flink/my.jar means a jar
>>>>> >> >> > >> > > > > > > >> >>>>> located on a remote HDFS, and
>>>>> >> >> > >> > > > > > > >> >>>>> local:///path/in/image/my.jar means a jar located on
>>>>> >> >> > >> > > > > > > >> >>>>> the jobmanager side.
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>> 2. Support running the user program on the master
>>>>> >> >> > >> > > > > > > >> >>>>> side. This also means the entry point will generate
>>>>> >> >> > >> > > > > > > >> >>>>> the job graph on the master side. We could use the
>>>>> >> >> > >> > > > > > > >> >>>>> ClasspathJobGraphRetriever or start a local Flink
>>>>> >> >> > >> > > > > > > >> >>>>> client to achieve this.
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>> cc tison, Aljoscha & Kostas: do you think this is
>>>>> >> >> > >> > > > > > > >> >>>>> the right direction to work in?
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>> On Thu, Dec 12, 2019 at 4:48 PM tison <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>> A quick idea is that we separate the deployment
>>>>> >> >> > >> > > > > > > >> >>>>>> from the user program, so that it is always done
>>>>> >> >> > >> > > > > > > >> >>>>>> outside the program. When the user program is
>>>>> >> >> > >> > > > > > > >> >>>>>> executed, there is always a ClusterClient that
>>>>> >> >> > >> > > > > > > >> >>>>>> communicates with an existing cluster, remote or
>>>>> >> >> > >> > > > > > > >> >>>>>> local. That will be another thread, so this is just
>>>>> >> >> > >> > > > > > > >> >>>>>> for your information.
>>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>> Best,
>>>>> >> >> > >> > > > > > > >> >>>>>> tison.
>>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>> On Thu, Dec 12, 2019 at 4:40 PM tison <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> Hi Peter,
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> Another concern I realized recently is that with
>>>>> >> >> > >> > > > > > > >> >>>>>>> the current Executors abstraction (FLIP-73), I'm
>>>>> >> >> > >> > > > > > > >> >>>>>>> afraid the user program is designed to ALWAYS run
>>>>> >> >> > >> > > > > > > >> >>>>>>> on the client side. Specifically, we deploy the job
>>>>> >> >> > >> > > > > > > >> >>>>>>> in the executor when env.execute is called. This
>>>>> >> >> > >> > > > > > > >> >>>>>>> abstraction possibly prevents Flink from running
>>>>> >> >> > >> > > > > > > >> >>>>>>> the user program on the cluster side.
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> For your proposal, in this case we have already
>>>>> >> >> > >> > > > > > > >> >>>>>>> compiled the program and run it on the client side,
>>>>> >> >> > >> > > > > > > >> >>>>>>> so even if we deploy a cluster and retrieve the job
>>>>> >> >> > >> > > > > > > >> >>>>>>> graph from program metadata, it doesn't make much
>>>>> >> >> > >> > > > > > > >> >>>>>>> sense.
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> cc Aljoscha & Kostas: what do you think about this
>>>>> >> >> > >> > > > > > > >> >>>>>>> constraint?
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> Best,
>>>>> >> >> > >> > > > > > > >> >>>>>>> tison.
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>> On Tue, Dec 10, 2019 at 12:45 PM Peter Huang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>> Hi Tison,
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>> Yes, you are right. I think I made the wrong
>>>>> >> >> > >> > > > > > > >> >>>>>>>> argument in the doc. Basically, the jar packaging
>>>>> >> >> > >> > > > > > > >> >>>>>>>> problem only affects platform users. In our
>>>>> >> >> > >> > > > > > > >> >>>>>>>> internal deploy service, we further optimized the
>>>>> >> >> > >> > > > > > > >> >>>>>>>> deployment latency by letting users package
>>>>> >> >> > >> > > > > > > >> >>>>>>>> flink-runtime together with the uber jar, so that
>>>>> >> >> > >> > > > > > > >> >>>>>>>> we don't need to consider multiple Flink version
>>>>> >> >> > >> > > > > > > >> >>>>>>>> support for now. In the session client mode, Flink
>>>>> >> >> > >> > > > > > > >> >>>>>>>> libs will be shipped anyway as local resources of
>>>>> >> >> > >> > > > > > > >> >>>>>>>> Yarn, so users actually don't need to package
>>>>> >> >> > >> > > > > > > >> >>>>>>>> those libs into the job jar.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>> Best Regards
>>>>> >> >> > >> > > > > > > >> >>>>>>>> Peter Huang
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>> On Mon, Dec 9, 2019 at 8:35 PM tison <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> need to compile their jars including
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> flink-clients, flink-optimizer, flink-table
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> codes?
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>> The answer should be no, because they exist in
>>>>> >> >> > >> > > > > > > >> >>>>>>>>> the system classpath.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>> Best,
>>>>> >> >> > >> > > > > > > >> >>>>>>>>> tison.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>> On Tue, Dec 10, 2019 at 12:18 PM Yang Wang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Hi Peter,
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Thanks a lot for starting this discussion. I
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> think this is a very useful feature.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> This is not only for Yarn: I am focused on the
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Flink on Kubernetes integration and have come
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> across the same problem. I do not want the job
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> graph generated on the client side. Instead, the
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> user jars are built into a user-defined image.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> When the job manager is launched, we just need to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> generate the job graph based on the local user
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> jars.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> I have some small suggestions about this.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 1. `ProgramJobGraphRetriever` is very similar to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> `ClasspathJobGraphRetriever`. The difference is
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> that the former needs `ProgramMetadata` and the
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> latter needs some arguments. Is it possible to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> have a unified `JobGraphRetriever` to support
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> both?
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 2. Is it possible to not use a local user jar to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> start a per-job cluster? In your case, the user
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> jars already exist on HDFS and we do need to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> download the jars to the deployer service.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Currently, we always need a local user jar to
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> start a Flink cluster. It would be great if we
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> could support remote user jars.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> In the implementation, we assume users package
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> flink-clients, flink-optimizer, flink-table
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> together within the job jar. Otherwise, the job
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> graph generation within JobClusterEntryPoint
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>> will fail.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> need to compile their jars including
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> flink-clients, flink-optimizer, flink-table
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> codes?
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Best,
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> Yang
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>> On Tue, Dec 10, 2019 at 2:37 AM Peter Huang <[email protected]> wrote:
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Dear All,
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Recently, the Flink community started to improve
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> the Yarn cluster descriptor to make the job jar
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> and config files configurable from the CLI. It
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> improves the flexibility of Flink deployment in
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Yarn per-job mode. For platform users who manage
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> tens of hundreds of streaming pipelines for the
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> whole org or company, we found that job graph
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> generation on the client side is another pain
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> point. Thus, we want to propose a configurable
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> feature for FlinkYarnSessionCli. The feature
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> allows users to choose job graph generation in
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> the Flink ClusterEntryPoint, so that the job jar
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> doesn't need to be available locally for job
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> graph generation. The proposal is organized as a
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> FLIP:
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Any questions and suggestions are welcome. Thank
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> you in advance.
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Best Regards
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>> Peter Huang
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>>
>>>>> >> >> > >> > > > > > > >> >>>>>
>>>>> >> >> > >> > > > > > > >> >>>>
>>>>> >> >> > >> > > > > > > >> >>>
>>>>> >> >> > >> > > > > > > >> >>
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > > >>
>>>>> >> >> > >> > > > > > >
>>>>> >> >> > >> > > > > >
>>>>> >> >> > >> > > > >
>>>>> >> >> > >> > > >
>>>>> >> >> > >> > >
>>>>> >> >> > >>
>>>>> >> >> > >
>>>>> >> >>
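
The jar URI schemes suggested in Yang Wang's message of Dec 18, 2019 (file://, hdfs://, local://) can be illustrated with a minimal sketch. It assumes the scheme alone decides where the jar lives. The names JAR_LOCATION_BY_SCHEME, jar_location and needs_upload are hypothetical and not part of Flink, and the rule that only client-side jars must be shipped is an assumption made for illustration, not something stated in the thread.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URI scheme to the side that holds the jar,
# following the three examples given in the thread. Not a Flink API.
JAR_LOCATION_BY_SCHEME = {
    "file": "client",       # jar on the machine running the Flink client
    "hdfs": "remote",       # jar stored on a remote HDFS
    "local": "jobmanager",  # jar already present on the jobmanager side
}


def jar_location(uri: str) -> str:
    """Return which side is expected to hold the jar named by `uri`."""
    scheme = urlparse(uri).scheme
    if scheme not in JAR_LOCATION_BY_SCHEME:
        raise ValueError(f"unsupported jar URI scheme: {scheme!r}")
    return JAR_LOCATION_BY_SCHEME[scheme]


def needs_upload(uri: str) -> bool:
    """Assumption: only a client-side jar has to be shipped by the client."""
    return jar_location(uri) == "client"
```

With such a rule, a client given local:///path/in/image/my.jar would not have to read any jar itself, which is what allows the job graph to be generated on the cluster side.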
