Hi Peter,

Having the application mode does not mean we will drop the cluster-deploy
option. I just want to share some thoughts about “Application Mode”.


1. The application mode could cover the per-job semantics. Its lifecycle is
bound to the user `main()`, and all the jobs in the user main will be executed
in the same Flink cluster. In the first phase of the FLIP-85 implementation,
running the user main on the cluster side could be supported in application
mode.

2. Maybe in the future, we will also need to support multiple `execute()`
calls on the client side in the same Flink cluster. Then the per-job mode will
evolve into application mode.

3. From the user's perspective, only a `-R/--remote-deploy` CLI option is
visible. They are not aware of the application mode.

4. In the first phase, the application mode works as “per-job” (only one job
in the user main). We just leave more potential for the future.
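
To picture points 1 and 4, here is a rough sketch in plain Java. It is not
Flink code: `SketchCluster`, `SketchEnvironment`, `UserMain` and
`runApplication` are hypothetical names made up for this mail. It only
assumes that the cluster lives exactly as long as the user `main()` and that
every `execute()` lands in that same cluster.

```java
import java.util.ArrayList;
import java.util.List;

public class ApplicationModeSketch {

    /** Hypothetical stand-in for a Flink cluster; not a real Flink class. */
    static final class SketchCluster {
        private final List<String> jobs = new ArrayList<>();
        private boolean running = true;

        void submit(String jobName) {
            if (!running) {
                throw new IllegalStateException("cluster already shut down");
            }
            jobs.add(jobName);
        }

        void shutdown() {
            running = false;
        }

        List<String> jobs() {
            return jobs;
        }

        boolean isRunning() {
            return running;
        }
    }

    /** Hypothetical environment handed to the user code. */
    static final class SketchEnvironment {
        private final SketchCluster cluster;

        SketchEnvironment(SketchCluster cluster) {
            this.cluster = cluster;
        }

        // Every execute() call goes to the one cluster owned by the application.
        void execute(String jobName) {
            cluster.submit(jobName);
        }
    }

    /** The user's main(), modelled as a callback. */
    interface UserMain {
        void run(SketchEnvironment env);
    }

    // The cluster lifecycle is bound to the user main: it is created before
    // the main runs and shut down when the main returns or throws.
    static SketchCluster runApplication(UserMain userMain) {
        SketchCluster cluster = new SketchCluster();
        try {
            userMain.run(new SketchEnvironment(cluster));
        } finally {
            cluster.shutdown();
        }
        return cluster;
    }

    public static void main(String[] args) {
        SketchCluster cluster = runApplication(env -> {
            env.execute("job-1");
            env.execute("job-2");
        });
        // prints "[job-1, job-2] running=false"
        System.out.println(cluster.jobs() + " running=" + cluster.isRunning());
    }
}
```

With a single `execute()` in the user main this behaves like today's per-job
mode; the second call only shows the room left for the future.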


I am not against calling it “cluster deploy mode” if you all think it
is clearer for users.



Best,
Yang

On Tue, Mar 3, 2020 at 6:49 PM Kostas Kloudas <kklou...@gmail.com> wrote:

> Hi Peter,
>
> I understand your point. This is why I was also a bit torn about the
> name and my proposal was a bit aligned with yours (something along the
> lines of "cluster deploy" mode).
>
> But many of the other participants in the discussion suggested the
> "Application Mode". I think that the reasoning is that now the user's
> Application is more self-contained.
> It will be submitted to the cluster and the user can just disconnect.
> In addition, as discussed briefly in the doc, in the future there may
> be better support for multi-execute applications which will bring us
> one step closer to the true "Application Mode". But this is how I
> interpreted their arguments, of course they can also express their
> thoughts on the topic :)
>
> Cheers,
> Kostas
>
> On Mon, Mar 2, 2020 at 6:15 PM Peter Huang <huangzhenqiu0...@gmail.com>
> wrote:
> >
> > Hi Kostas,
> >
> > Thanks for updating the wiki. We have aligned with the implementations
> > in the doc. But I feel the naming is still a little confusing from a
> > user's perspective. It is well known that Flink supports per-job clusters
> > and session clusters. That concept sits at the layer of how a job is
> > managed within Flink. The method introduced until now is a kind of mix of
> > job and session cluster that compromises on implementation complexity. We
> > probably don't need to label it as Application Mode at the same layer as
> > per-job cluster and session cluster. Conceptually, I think it is still a
> > cluster mode implementation for the per-job cluster.
> >
> > To minimize confusion for users, I think it would be better to make it
> > just an option of the per-job cluster for each type of cluster manager.
> > What do you think?
> >
> >
> > Best Regards
> > Peter Huang
> >
> >
> >
> >
> >
> >
> >
> >
> > On Mon, Mar 2, 2020 at 7:22 AM Kostas Kloudas <kklou...@gmail.com>
> wrote:
> >>
> >> Hi Yang,
> >>
> >> The difference between per-job and application mode is that, as you
> >> described, in the per-job mode the main is executed on the client
> >> while in the application mode, the main is executed on the cluster.
> >> I do not think we have to offer "application mode" with running the
> >> main on the client side as this is exactly what the per-job mode does
> >> currently and, as you described also, it would be redundant.
> >>
> >> Sorry if this was not clear in the document.
> >>
> >> Cheers,
> >> Kostas
> >>
> >> On Mon, Mar 2, 2020 at 3:17 PM Yang Wang <danrtsey...@gmail.com> wrote:
> >> >
> >> > Hi Kostas,
> >> >
> >> > Thanks a lot for your conclusion and for updating the FLIP-85 wiki.
> >> > Currently, I have no more questions about the motivation, approach,
> >> > fault tolerance and the first phase implementation.
> >> >
> >> > I think the new title "Flink Application Mode" makes a lot of sense to
> >> > me. Especially for containerized environments, the cluster deploy
> >> > option will be very useful.
> >> >
> >> > Just one concern: how do we introduce this new application mode to our
> >> > users? Each user program (i.e. `main()`) is an application. Currently,
> >> > we intend to support only one `execute()`. So what's the difference
> >> > between per-job and application mode?
> >> >
> >> > For per-job, the user `main()` is always executed on the client side.
> >> > And for application mode, the user `main()` could be executed on the
> >> > client or master side (configured via a CLI option). Right? We need to
> >> > have a clear concept. Otherwise, the users will be more and more
> >> > confused.
> >> >
> >> >
> >> > Best,
> >> > Yang
> >> >
> >> > On Mon, Mar 2, 2020 at 5:58 PM Kostas Kloudas <kklou...@gmail.com> wrote:
> >> >>
> >> >> Hi all,
> >> >>
> >> >> I updated
> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Flink+Application+Mode
> >> >> based on the discussion we had here:
> >> >>
> >> >> https://docs.google.com/document/d/1ji72s3FD9DYUyGuKnJoO4ApzV-nSsZa0-bceGXW7Ocw/edit#
> >> >>
> >> >> Please let me know what you think and please keep the discussion in
> >> >> the ML :)
> >> >>
> >> >> Thanks for starting the discussion and I hope that soon we will be
> >> >> able to vote on the FLIP.
> >> >>
> >> >> Cheers,
> >> >> Kostas
> >> >>
> >> >> On Thu, Jan 16, 2020 at 3:40 AM Yang Wang <danrtsey...@gmail.com>
> wrote:
> >> >> >
> >> >> > Hi all,
> >> >> >
> >> >> > Thanks a lot for the feedback from @Kostas Kloudas. All your concerns
> >> >> > are on point. FLIP-85 is mainly focused on supporting cluster mode
> >> >> > for per-job, since it is more urgent and has many more use cases in
> >> >> > both Yarn and Kubernetes deployments. For session cluster, we could
> >> >> > have more discussion in a new thread later.
> >> >> >
> >> >> > #1, How to download the user jars and dependencies for per-job in
> >> >> > cluster mode?
> >> >> > For Yarn, we could register the user jars and dependencies as
> >> >> > LocalResource. They will be distributed by Yarn, and once the
> >> >> > JobManager and TaskManager are launched, the jars already exist.
> >> >> > For Standalone per-job and K8s, we expect that the user jars and
> >> >> > dependencies are built into the image. Or the InitContainer could be
> >> >> > used for downloading. It is natively distributed, so we will not
> >> >> > have a bottleneck.
> >> >> >
> >> >> > #2, Job graph recovery
> >> >> > We could have an optimization to store the job graph on the DFS.
> >> >> > However, I suggest that building a new job graph from the
> >> >> > configuration be the default option, since we will not always have
> >> >> > a DFS store when deploying a Flink per-job cluster. Of course, we
> >> >> > assume that using the same configuration (e.g. job_id, user_jar,
> >> >> > main_class, main_args, parallelism, savepoint_settings, etc.) will
> >> >> > produce the same job graph. I think the standalone per-job mode
> >> >> > already has similar behavior.
> >> >> >
> >> >> > #3, What happens with jobs that have multiple execute calls?
> >> >> > Currently, it is really a problem. Even if we use a local client on
> >> >> > the Flink master side, it will behave differently from client mode.
> >> >> > In client mode, if we execute multiple times, we will deploy a
> >> >> > separate Flink cluster for each execute. I am not quite sure whether
> >> >> > that is reasonable. However, I still think using the local client is
> >> >> > a good choice. We could continue the discussion in a new thread.
> >> >> > @Zili Chen <wander4...@gmail.com> Do you want to drive this?
> >> >> >
> >> >> >
> >> >> >
> >> >> > Best,
> >> >> > Yang
> >> >> >
> >> >> > On Thu, Jan 16, 2020 at 1:55 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> >
> >> >> > > Hi Kostas,
> >> >> > >
> >> >> > > Thanks for this feedback. I can't agree more. The cluster mode
> >> >> > > should be added first to the per-job cluster.
> >> >> > >
> >> >> > > 1) For job cluster implementation
> >> >> > > 1. Job graph recovery from configuration, or stored as a static job
> >> >> > > graph as in the session cluster. I think the static one will be
> >> >> > > better because of its shorter recovery time.
> >> >> > > Let me update the doc for details.
> >> >> > >
> >> >> > > 2. For jobs that execute multiple times, I think @Zili Chen
> >> >> > > <wander4...@gmail.com> has proposed the local client solution that
> >> >> > > can actually run the program in the cluster entry point. We can put
> >> >> > > the implementation in the second stage, or even in a new FLIP for
> >> >> > > further discussion.
> >> >> > >
> >> >> > > 2) For session cluster implementation
> >> >> > > We can disable the cluster mode for the session cluster in the
> >> >> > > first stage. I agree the jar downloading will be a painful thing.
> >> >> > > We can consider a PoC and performance evaluation first. If the
> >> >> > > end-to-end experience is good enough, then we can consider
> >> >> > > proceeding with the solution.
> >> >> > >
> >> >> > > Looking forward to more opinions from @Yang Wang
> >> >> > > <danrtsey...@gmail.com> @Zili Chen <wander4...@gmail.com>
> >> >> > > @Dian Fu <dian0511...@gmail.com>.
> >> >> > >
> >> >> > >
> >> >> > > Best Regards
> >> >> > > Peter Huang
> >> >> > >
> >> >> > > On Wed, Jan 15, 2020 at 7:50 AM Kostas Kloudas <
> kklou...@gmail.com> wrote:
> >> >> > >
> >> >> > >> Hi all,
> >> >> > >>
> >> >> > >> I am writing here as the discussion on the Google Doc seems to be
> >> >> > >> a bit difficult to follow.
> >> >> > >>
> >> >> > >> I think that in order to be able to make progress, it would be
> >> >> > >> helpful to focus on per-job mode for now.
> >> >> > >> The reason is that:
> >> >> > >>  1) making the (unique) JobSubmitHandler responsible for creating
> >> >> > >>   the jobgraphs, which includes downloading dependencies, is not
> >> >> > >>   an optimal solution
> >> >> > >>  2) even if we put the responsibility on the JobMaster, currently
> >> >> > >>   each job has its own JobMaster but they all run on the same
> >> >> > >>   process, so we have again a single entity.
> >> >> > >>
> >> >> > >> Of course after this is done, and if we feel comfortable with the
> >> >> > >> solution, then we can go to the session mode.
> >> >> > >>
> >> >> > >> A second comment has to do with fault-tolerance in the per-job,
> >> >> > >> cluster-deploy mode.
> >> >> > >> In the document, it is suggested that upon recovery, the JobMaster
> >> >> > >> of each job re-creates the JobGraph.
> >> >> > >> I am just wondering if it is better to create and store the
> >> >> > >> jobGraph upon submission and only fetch it upon recovery so that
> >> >> > >> we have a static jobGraph.
> >> >> > >>
> >> >> > >> Finally, I have a question which is what happens with jobs that
> >> >> > >> have multiple execute calls?
> >> >> > >> The semantics seem to change compared to the current behaviour,
> >> >> > >> right?
> >> >> > >>
> >> >> > >> Cheers,
> >> >> > >> Kostas
> >> >> > >>
> >> >> > >> On Wed, Jan 8, 2020 at 8:05 PM tison <wander4...@gmail.com>
> wrote:
> >> >> > >> >
> >> >> > >> > Not always. Yang Wang is also not yet a committer but he can join
> >> >> > >> > the channel. I cannot find the id by clicking “Add new member in
> >> >> > >> > channel”, so I came to you and asked you to try out the link.
> >> >> > >> > Possibly I will find other ways, but the original purpose is that
> >> >> > >> > the slack channel is a public area where we discuss development...
> >> >> > >> > Best,
> >> >> > >> > tison.
> >> >> > >> >
> >> >> > >> >
> >> >> > >> > On Thu, Jan 9, 2020 at 2:44 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> > >> >
> >> >> > >> > > Hi Tison,
> >> >> > >> > >
> >> >> > >> > > I am not a Flink committer yet, so I think I can't join it either.
> >> >> > >> > >
> >> >> > >> > >
> >> >> > >> > > Best Regards
> >> >> > >> > > Peter Huang
> >> >> > >> > >
> >> >> > >> > > On Wed, Jan 8, 2020 at 9:39 AM tison <wander4...@gmail.com>
> wrote:
> >> >> > >> > >
> >> >> > >> > > > Hi Peter,
> >> >> > >> > > >
> >> >> > >> > > > Could you try out this link?
> >> >> > >> > > > https://the-asf.slack.com/messages/CNA3ADZPH
> >> >> > >> > > >
> >> >> > >> > > > Best,
> >> >> > >> > > > tison.
> >> >> > >> > > >
> >> >> > >> > > >
> >> >> > >> > > > On Thu, Jan 9, 2020 at 1:22 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> > >> > > >
> >> >> > >> > > > > Hi Tison,
> >> >> > >> > > > >
> >> >> > >> > > > > I can't join the group with the shared link. Would you please
> >> >> > >> > > > > add me to the group? My slack account is huangzhenqiu0825.
> >> >> > >> > > > > Thank you in advance.
> >> >> > >> > > > >
> >> >> > >> > > > >
> >> >> > >> > > > > Best Regards
> >> >> > >> > > > > Peter Huang
> >> >> > >> > > > >
> >> >> > >> > > > > On Wed, Jan 8, 2020 at 12:02 AM tison <
> wander4...@gmail.com>
> >> >> > >> wrote:
> >> >> > >> > > > >
> >> >> > >> > > > > > Hi Peter,
> >> >> > >> > > > > >
> >> >> > >> > > > > > As described above, this effort should get attention from
> >> >> > >> > > > > > people developing FLIP-73, a.k.a. Executor abstractions. I
> >> >> > >> > > > > > recommend that you join the public slack channel[1] for
> >> >> > >> > > > > > Flink Client API Enhancement, where you can share your
> >> >> > >> > > > > > detailed thoughts. It will possibly get more concrete
> >> >> > >> > > > > > attention.
> >> >> > >> > > > > >
> >> >> > >> > > > > > Best,
> >> >> > >> > > > > > tison.
> >> >> > >> > > > > >
> >> >> > >> > > > > > [1]
> >> >> > >> > > > > > https://slack.com/share/IS21SJ75H/Rk8HhUly9FuEHb7oGwBZ33uL/enQtODg2MDYwNjE5MTg3LTA2MjIzNDc1M2ZjZDVlMjdlZjk1M2RkYmJhNjAwMTk2ZDZkODQ4NmY5YmI4OGRhNWJkYTViMTM1NzlmMzc4OWM
> >> >> > >> > > > > >
> >> >> > >> > > > > >
> >> >> > >> > > > > > On Tue, Jan 7, 2020 at 5:09 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> > >> > > > > >
> >> >> > >> > > > > > > Dear All,
> >> >> > >> > > > > > >
> >> >> > >> > > > > > > Happy new year! Based on the existing feedback from the
> >> >> > >> > > > > > > community, we revised the doc to cover session cluster
> >> >> > >> > > > > > > support, the concrete interface changes needed, and the
> >> >> > >> > > > > > > execution plan. Please take one more round of review at
> >> >> > >> > > > > > > your convenience.
> >> >> > >> > > > > > >
> >> >> > >> > > > > > > https://docs.google.com/document/d/1aAwVjdZByA-0CHbgv16Me-vjaaDMCfhX7TzVVTuifYM/edit#
> >> >> > >> > > > > > >
> >> >> > >> > > > > > >
> >> >> > >> > > > > > > Best Regards
> >> >> > >> > > > > > > Peter Huang
> >> >> > >> > > > > > >
> >> >> > >> > > > > > >
> >> >> > >> > > > > > >
> >> >> > >> > > > > > >
> >> >> > >> > > > > > >
> >> >> > >> > > > > > > On Thu, Jan 2, 2020 at 11:29 AM Peter Huang <
> >> >> > >> > > > > huangzhenqiu0...@gmail.com>
> >> >> > >> > > > > > > wrote:
> >> >> > >> > > > > > >
> >> >> > >> > > > > > > > Hi Dian,
> >> >> > >> > > > > > > > Thanks for giving us valuable feedback.
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > 1) It's better to have a whole design for this feature
> >> >> > >> > > > > > > > For the suggestion of enabling the cluster mode also for
> >> >> > >> > > > > > > > session cluster, I think Flink already supports it.
> >> >> > >> > > > > > > > WebSubmissionExtension already allows users to start a
> >> >> > >> > > > > > > > job with the specified jar by using the web UI.
> >> >> > >> > > > > > > > But we need to enable the feature from the CLI for both
> >> >> > >> > > > > > > > local jars and remote jars.
> >> >> > >> > > > > > > > I will align with Yang Wang first about the details and
> >> >> > >> > > > > > > > update the design doc.
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > 2) It's better to consider the convenience for users,
> >> >> > >> > > > > > > > such as debugging
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > I am wondering whether we can store the exception from
> >> >> > >> > > > > > > > job graph generation in the application master. As no
> >> >> > >> > > > > > > > streaming graph can be scheduled in this case, no more
> >> >> > >> > > > > > > > TMs will be requested from FlinkRM.
> >> >> > >> > > > > > > > If the AM is still running, users can still query it
> >> >> > >> > > > > > > > from the CLI. As it requires more changes, we can get
> >> >> > >> > > > > > > > some feedback from <aljos...@apache.org>
> >> >> > >> > > > > > > > and @zjf...@gmail.com <zjf...@gmail.com>.
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > 3) It's better to consider the impact to the
> stability of
> >> >> > >> the
> >> >> > >> > > > cluster
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > I agree with Yang Wang's opinion.
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > Best Regards
> >> >> > >> > > > > > > > Peter Huang
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > > On Sun, Dec 29, 2019 at 9:44 PM Dian Fu <
> >> >> > >> dian0511...@gmail.com>
> >> >> > >> > > > > wrote:
> >> >> > >> > > > > > > >
> >> >> > >> > > > > > > >> Hi all,
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> Sorry to jump into this discussion. Thanks everyone for
> >> >> > >> > > > > > > >> the discussion. I'm very interested in this topic
> >> >> > >> > > > > > > >> although I'm not an expert in this part. So I'm glad to
> >> >> > >> > > > > > > >> share my thoughts as follows:
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> 1) It's better to have a whole design for this feature
> >> >> > >> > > > > > > >> As we know, there are two deployment modes: per-job mode
> >> >> > >> > > > > > > >> and session mode. I'm wondering which mode really needs
> >> >> > >> > > > > > > >> this feature. As the design doc mentioned, per-job mode
> >> >> > >> > > > > > > >> is more used for streaming jobs and session mode is
> >> >> > >> > > > > > > >> usually used for batch jobs (of course, the job types
> >> >> > >> > > > > > > >> and the deployment modes are orthogonal). Usually a
> >> >> > >> > > > > > > >> streaming job only needs to be submitted once and it
> >> >> > >> > > > > > > >> will run for days or weeks, while batch jobs will be
> >> >> > >> > > > > > > >> submitted more frequently compared with streaming jobs.
> >> >> > >> > > > > > > >> This means that maybe session mode also needs this
> >> >> > >> > > > > > > >> feature. However, if we support this feature in session
> >> >> > >> > > > > > > >> mode, the application master will become the new
> >> >> > >> > > > > > > >> centralized service (which should be solved). So in this
> >> >> > >> > > > > > > >> case, it's better to have a complete design for both
> >> >> > >> > > > > > > >> per-job mode and session mode. Furthermore, even if we
> >> >> > >> > > > > > > >> can do it phase by phase, we need to have a whole
> >> >> > >> > > > > > > >> picture of how it works in both per-job mode and
> >> >> > >> > > > > > > >> session mode.
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> 2) It's better to consider the convenience for users,
> >> >> > >> > > > > > > >> such as debugging
> >> >> > >> > > > > > > >> After we finish this feature, the job graph will be
> >> >> > >> > > > > > > >> compiled in the application master, which means that
> >> >> > >> > > > > > > >> users cannot easily get the exception message
> >> >> > >> > > > > > > >> synchronously in the job client if there are problems
> >> >> > >> > > > > > > >> during job graph compilation (especially for platform
> >> >> > >> > > > > > > >> users), for example if the resource path is incorrect
> >> >> > >> > > > > > > >> or the user program itself has problems. What I'm
> >> >> > >> > > > > > > >> thinking is that maybe we should throw the exceptions
> >> >> > >> > > > > > > >> as early as possible (during the job submission stage).
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> 3) It's better to consider the impact to the stability
> >> >> > >> > > > > > > >> of the cluster
> >> >> > >> > > > > > > >> If we perform the compiling in the application master,
> >> >> > >> > > > > > > >> we should consider the impact of the compiling errors.
> >> >> > >> > > > > > > >> Although YARN could resume the application master in
> >> >> > >> > > > > > > >> case of failures, in some cases the compilation failure
> >> >> > >> > > > > > > >> may waste cluster resources and may impact the stability
> >> >> > >> > > > > > > >> of the cluster and the other jobs in it, for example
> >> >> > >> > > > > > > >> when the resource path is incorrect or the user program
> >> >> > >> > > > > > > >> itself has problems (in this case, job failover cannot
> >> >> > >> > > > > > > >> solve this kind of problem). In the current
> >> >> > >> > > > > > > >> implementation, compilation errors are handled on the
> >> >> > >> > > > > > > >> client side and there is no impact on the cluster at
> >> >> > >> > > > > > > >> all.
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> Regarding 1), it's clearly pointed out in the design doc
> >> >> > >> > > > > > > >> that only per-job mode will be supported. However, I
> >> >> > >> > > > > > > >> think it's better to also consider the session mode in
> >> >> > >> > > > > > > >> the design doc.
> >> >> > >> > > > > > > >> Regarding 2) and 3), I have not seen related sections in
> >> >> > >> > > > > > > >> the design doc. It would be good if we could cover them
> >> >> > >> > > > > > > >> in the design doc.
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> Feel free to correct me if there is anything I
> >> >> > >> > > > > > > >> misunderstand.
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> Regards,
> >> >> > >> > > > > > > >> Dian
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >>
> >> >> > >> > > > > > > >> > On Dec 27, 2019, at 3:13 AM, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > Hi Yang,
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > I can't agree more. The effort definitely needs to
> >> >> > >> > > > > > > >> > align with the final goal of FLIP-73.
> >> >> > >> > > > > > > >> > I am thinking about whether we can achieve the goal in
> >> >> > >> > > > > > > >> > two phases.
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > 1) Phase I
> >> >> > >> > > > > > > >> > As the CliFrontend will not be deprecated soon, we can
> >> >> > >> > > > > > > >> > still use the deployMode flag there, pass the program
> >> >> > >> > > > > > > >> > info through the Flink configuration, and use the
> >> >> > >> > > > > > > >> > ClassPathJobGraphRetriever to generate the job graph
> >> >> > >> > > > > > > >> > in the ClusterEntrypoints of Yarn and Kubernetes.
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > 2) Phase II
> >> >> > >> > > > > > > >> > In AbstractJobClusterExecutor, the job graph is
> >> >> > >> > > > > > > >> > generated in the execute function. We can still use
> >> >> > >> > > > > > > >> > the deployMode in it. With deployMode = cluster, the
> >> >> > >> > > > > > > >> > execute function only starts the cluster.
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > When {Yarn/Kubernetes}PerJobClusterEntrypoint starts,
> >> >> > >> > > > > > > >> > it will start the dispatcher first; then we can use a
> >> >> > >> > > > > > > >> > ClusterEnvironment similar to ContextEnvironment to
> >> >> > >> > > > > > > >> > submit the job with jobName to the local dispatcher.
> >> >> > >> > > > > > > >> > For the details, we need more investigation. Let's
> >> >> > >> > > > > > > >> > wait for @Aljoscha Krettek <aljos...@apache.org> and
> >> >> > >> > > > > > > >> > @Till Rohrmann <trohrm...@apache.org>'s feedback after
> >> >> > >> > > > > > > >> > the holiday season.
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > Thank you in advance. Merry Christmas and Happy New
> >> >> > >> > > > > > > >> > Year!!!
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > Best Regards
> >> >> > >> > > > > > > >> > Peter Huang
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> > On Wed, Dec 25, 2019 at 1:08 AM Yang Wang <
> >> >> > >> > > > danrtsey...@gmail.com>
> >> >> > >> > > > > > > >> wrote:
> >> >> > >> > > > > > > >> >
> >> >> > >> > > > > > > >> >> Hi Peter,
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >> I think we need to reconsider tison's suggestion
> >> >> > >> > > > > > > >> >> seriously. After FLIP-73, the deployJobCluster has
> >> >> > >> > > > > > > >> >> been moved into `JobClusterExecutor#execute`. It
> >> >> > >> > > > > > > >> >> should not be perceived by `CliFrontend`. That means
> >> >> > >> > > > > > > >> >> the user program will *ALWAYS* be executed on the
> >> >> > >> > > > > > > >> >> client side. This is the by-design behavior.
> >> >> > >> > > > > > > >> >> So, we could not just add `if(client mode) .. else
> >> >> > >> > > > > > > >> >> if(cluster mode) ...` code in `CliFrontend` to bypass
> >> >> > >> > > > > > > >> >> the executor. We need to find a clean way to decouple
> >> >> > >> > > > > > > >> >> executing the user program from deploying the per-job
> >> >> > >> > > > > > > >> >> cluster. Based on this, we could support executing
> >> >> > >> > > > > > > >> >> the user program on the client or master side.
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >> Maybe Aljoscha and Jeff could give some good
> >> >> > >> > > > > > > >> >> suggestions.
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >> Best,
> >> >> > >> > > > > > > >> >> Yang
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >> On Wed, Dec 25, 2019 at 4:03 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> >> >> > >> > > > > > > >> >>
> >> >> > >> > > > > > > >> >>> Hi Jingjing,
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>> The improvement proposed is a deployment option for
> >> >> > >> > > > > > > >> >>> the CLI. For SQL-based Flink applications, it is
> >> >> > >> > > > > > > >> >>> more convenient to use the existing model in
> >> >> > >> > > > > > > >> >>> SqlClient, in which the job graph is generated
> >> >> > >> > > > > > > >> >>> within SqlClient. After adding the delayed job graph
> >> >> > >> > > > > > > >> >>> generation, I think no change is needed on your
> >> >> > >> > > > > > > >> >>> side.
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>> Best Regards
> >> >> > >> > > > > > > >> >>> Peter Huang
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>> On Wed, Dec 18, 2019 at 6:01 AM jingjing bai
> <
> >> >> > >> > > > > > > >> baijingjing7...@gmail.com>
> >> >> > >> > > > > > > >> >>> wrote:
> >> >> > >> > > > > > > >> >>>
> >> >> > >> > > > > > > >> >>>> hi peter:
> >> >> > >> > > > > > > >> >>>> We have extended SqlClient to support SQL job
> >> >> > >> > > > > > > >> >>>> submission from the web, based on Flink 1.9. We
> >> >> > >> > > > > > > >> >>>> support submitting to Yarn in per-job mode too.
> >> >> > >> > > > > > > >> >>>> In this case, the job graph is generated on the
> >> >> > >> > > > > > > >> >>>> client side. I think this discussion is mainly
> >> >> > >> > > > > > > >> >>>> about improving API programs, but in my case there
> >> >> > >> > > > > > > >> >>>> is no jar to upload, only a SQL string.
> >> >> > >> > > > > > > >> >>>> Do you have more suggestions to improve the SQL
> >> >> > >> > > > > > > >> >>>> mode, or is it only a switch for API programs?
> >> >> > >> > > > > > > >> >>>>
> >> >> > >> > > > > > > >> >>>>
> >> >> > >> > > > > > > >> >>>> best
> >> >> > >> > > > > > > >> >>>> bai jj
> >> >> > >> > > > > > > >> >>>>
> >> >> > >> > > > > > > >> >>>>
> >> >> > >> > > > > > > >> >>>> On Wed, Dec 18, 2019 at 7:21 PM Yang Wang <danrtsey...@gmail.com> wrote:
> >> >> > >> > > > > > > >> >>>>
> >> >> > >> > > > > > > >> >>>>> I just want to revive this discussion.
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>> Recently, i am thinking about how to
> natively run
> >> >> > >> flink
> >> >> > >> > > > > per-job
> >> >> > >> > > > > > > >> >>> cluster on
> >> >> > >> > > > > > > >> >>>>> Kubernetes.
> >> >> > >> > > > > > > >> >>>>> The per-job mode on Kubernetes is very
> different
> >> >> > >> from on
> >> >> > >> > > > Yarn.
> >> >> > >> > > > > > And
> >> >> > >> > > > > > > >> we
> >> >> > >> > > > > > > >> >>> will
> >> >> > >> > > > > > > >> >>>>> have
> >> >> > >> > > > > > > >> >>>>> the same deployment requirements to the
> client and
> >> >> > >> entry
> >> >> > >> > > > > point.
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>> 1. The Flink client does not always need a local jar to start a
> >> >> > >> > > > > > > >> >>>>> Flink per-job cluster. We could support multiple schemes. For
> >> >> > >> > > > > > > >> >>>>> example, file:///path/of/my.jar means a jar located on the client
> >> >> > >> > > > > > > >> >>>>> side, hdfs://myhdfs/user/myname/flink/my.jar means a jar located
> >> >> > >> > > > > > > >> >>>>> on remote HDFS, and local:///path/in/image/my.jar means a jar
> >> >> > >> > > > > > > >> >>>>> located on the jobmanager side.
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>> 2. Support running the user program on the master side. This also
> >> >> > >> > > > > > > >> >>>>> means the entry point will generate the job graph on the master
> >> >> > >> > > > > > > >> >>>>> side. We could use the ClasspathJobGraphRetriever or start a local
> >> >> > >> > > > > > > >> >>>>> Flink client to achieve this.
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>> cc tison, Aljoscha & Kostas: do you think this is the right
> >> >> > >> > > > > > > >> >>>>> direction for us to work in?
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>> tison <wander4...@gmail.com> wrote on Thu, Dec 12, 2019 at 4:48 PM:
> >> >> > >> > > > > > > >> >>>>>
> >> >> > >> > > > > > > >> >>>>>> A quick idea is that we separate the deployment from the user
> >> >> > >> > > > > > > >> >>>>>> program, so that it is always done outside the program. When the
> >> >> > >> > > > > > > >> >>>>>> user program is executed, there is always a ClusterClient that
> >> >> > >> > > > > > > >> >>>>>> communicates with an existing cluster, remote or local. It will be
> >> >> > >> > > > > > > >> >>>>>> another thread, so this is just for your information.
> >> >> > >> > > > > > > >> >>>>>>
> >> >> > >> > > > > > > >> >>>>>> Best,
> >> >> > >> > > > > > > >> >>>>>> tison.
> >> >> > >> > > > > > > >> >>>>>>
> >> >> > >> > > > > > > >> >>>>>>
> >> >> > >> > > > > > > >> >>>>>> tison <wander4...@gmail.com> wrote on Thu, Dec 12, 2019 at 4:40 PM:
> >> >> > >> > > > > > > >> >>>>>>
> >> >> > >> > > > > > > >> >>>>>>> Hi Peter,
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>> Another concern I realized recently is that, with the current
> >> >> > >> > > > > > > >> >>>>>>> Executors abstraction (FLIP-73), I'm afraid the user program is
> >> >> > >> > > > > > > >> >>>>>>> designed to ALWAYS run on the client side. Specifically, we deploy
> >> >> > >> > > > > > > >> >>>>>>> the job in the executor when env.execute is called. This
> >> >> > >> > > > > > > >> >>>>>>> abstraction possibly prevents Flink from running the user program
> >> >> > >> > > > > > > >> >>>>>>> on the cluster side.
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>> For your proposal, in this case we have already compiled the
> >> >> > >> > > > > > > >> >>>>>>> program and run it on the client side, so even if we deploy a
> >> >> > >> > > > > > > >> >>>>>>> cluster and retrieve the job graph from program metadata, it
> >> >> > >> > > > > > > >> >>>>>>> doesn't make much sense.
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>> cc Aljoscha & Kostas: what do you think about this constraint?
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>> Best,
> >> >> > >> > > > > > > >> >>>>>>> tison.
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>> Peter Huang <huangzhenqiu0...@gmail.com> wrote on Tue, Dec 10, 2019 at 12:45 PM:
> >> >> > >> > > > > > > >> >>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>> Hi Tison,
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>> Yes, you are right. I think I made the wrong argument in the doc.
> >> >> > >> > > > > > > >> >>>>>>>> Basically, the jar packaging problem applies only to platform
> >> >> > >> > > > > > > >> >>>>>>>> users. In our internal deploy service, we further optimized the
> >> >> > >> > > > > > > >> >>>>>>>> deployment latency by letting users package flink-runtime together
> >> >> > >> > > > > > > >> >>>>>>>> with the uber jar, so that we don't need to consider multiple
> >> >> > >> > > > > > > >> >>>>>>>> Flink version support for now. In the session client mode, Flink
> >> >> > >> > > > > > > >> >>>>>>>> libs will be shipped anyway as local resources of Yarn, so users
> >> >> > >> > > > > > > >> >>>>>>>> don't actually need to package those libs into the job jar.
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>> Best Regards
> >> >> > >> > > > > > > >> >>>>>>>> Peter Huang
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>> On Mon, Dec 9, 2019 at 8:35 PM tison <wander4...@gmail.com> wrote:
> >> >> > >> > > > > > > >> >>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users need to compile
> >> >> > >> > > > > > > >> >>>>>>>>>> their jars including flink-clients, flink-optimizer, flink-table
> >> >> > >> > > > > > > >> >>>>>>>>>> codes?
> >> >> > >> > > > > > > >> >>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>> The answer should be no, because they exist in the system classpath.
> >> >> > >> > > > > > > >> >>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>> Best,
> >> >> > >> > > > > > > >> >>>>>>>>> tison.
> >> >> > >> > > > > > > >> >>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>> Yang Wang <danrtsey...@gmail.com> wrote on Tue, Dec 10, 2019 at 12:18 PM:
> >> >> > >> > > > > > > >> >>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> Hi Peter,
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> Thanks a lot for starting this discussion. I think this is a very
> >> >> > >> > > > > > > >> >>>>>>>>>> useful feature.
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> This is not only relevant for Yarn: I am focused on the Flink on
> >> >> > >> > > > > > > >> >>>>>>>>>> Kubernetes integration and have come across the same problem. I do
> >> >> > >> > > > > > > >> >>>>>>>>>> not want the job graph generated on the client side. Instead, the
> >> >> > >> > > > > > > >> >>>>>>>>>> user jars are built into a user-defined image. When the job
> >> >> > >> > > > > > > >> >>>>>>>>>> manager is launched, we just need to generate the job graph based
> >> >> > >> > > > > > > >> >>>>>>>>>> on the local user jars.
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> I have some small suggestions about this.
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> 1. `ProgramJobGraphRetriever` is very similar to
> >> >> > >> > > > > > > >> >>>>>>>>>> `ClasspathJobGraphRetriever`. The difference is that the former
> >> >> > >> > > > > > > >> >>>>>>>>>> needs `ProgramMetadata` and the latter needs some arguments. Is it
> >> >> > >> > > > > > > >> >>>>>>>>>> possible to have a unified `JobGraphRetriever` to support both?
> >> >> > >> > > > > > > >> >>>>>>>>>> 2. Is it possible to not use a local user jar to start a per-job
> >> >> > >> > > > > > > >> >>>>>>>>>> cluster? In your case, the user jars already exist on HDFS and we
> >> >> > >> > > > > > > >> >>>>>>>>>> do need to download the jars to the deployer service. Currently,
> >> >> > >> > > > > > > >> >>>>>>>>>> we always need a local user jar to start a Flink cluster. It would
> >> >> > >> > > > > > > >> >>>>>>>>>> be great if we could support remote user jars.
> >> >> > >> > > > > > > >> >>>>>>>>>>>> In the implementation, we assume users package flink-clients,
> >> >> > >> > > > > > > >> >>>>>>>>>>>> flink-optimizer, flink-table together within the job jar.
> >> >> > >> > > > > > > >> >>>>>>>>>>>> Otherwise, the job graph generation within JobClusterEntryPoint
> >> >> > >> > > > > > > >> >>>>>>>>>>>> will fail.
> >> >> > >> > > > > > > >> >>>>>>>>>> 3. What do you mean about the package? Do users need to compile
> >> >> > >> > > > > > > >> >>>>>>>>>> their jars including flink-clients, flink-optimizer, flink-table
> >> >> > >> > > > > > > >> >>>>>>>>>> codes?
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> Best,
> >> >> > >> > > > > > > >> >>>>>>>>>> Yang
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>> Peter Huang <huangzhenqiu0...@gmail.com> wrote on Tue, Dec 10, 2019 at 2:37 AM:
> >> >> > >> > > > > > > >> >>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>> Dear All,
> >> >> > >> > > > > > > >> >>>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>> Recently, the Flink community started to improve the yarn cluster
> >> >> > >> > > > > > > >> >>>>>>>>>>> descriptor to make the job jar and config files configurable from
> >> >> > >> > > > > > > >> >>>>>>>>>>> the CLI. It improves the flexibility of Flink deployment in Yarn
> >> >> > >> > > > > > > >> >>>>>>>>>>> Per Job Mode. For platform users who manage tens of hundreds of
> >> >> > >> > > > > > > >> >>>>>>>>>>> streaming pipelines for the whole org or company, we found that
> >> >> > >> > > > > > > >> >>>>>>>>>>> job graph generation on the client side is another pain point.
> >> >> > >> > > > > > > >> >>>>>>>>>>> Thus, we want to propose a configurable feature for
> >> >> > >> > > > > > > >> >>>>>>>>>>> FlinkYarnSessionCli. The feature allows users to choose job graph
> >> >> > >> > > > > > > >> >>>>>>>>>>> generation in the Flink ClusterEntryPoint, so that the job jar
> >> >> > >> > > > > > > >> >>>>>>>>>>> doesn't need to be local for the job graph generation. The
> >> >> > >> > > > > > > >> >>>>>>>>>>> proposal is organized as a FLIP:
> >> >> > >> > > > > > > >> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
> >> >> > >> > > > > > > >> >>>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>> Any questions and suggestions are welcome. Thank you in advance.
> >> >> > >> > > > > > > >> >>>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>>
> >> >> > >> > > > > > > >> >>>>>>>>>>> Best Regards
> >> >> > >> > > > > > > >> >>>>>>>>>>> Peter Huang
> >> >> > >> > > > > > > >> >>>>>>>>>>>
