Re: [DISCUSS] FLIP-85: Delayed Job Graph Generation

Peter Huang Mon, 09 Dec 2019 20:45:39 -0800

Hi Tison,

Yes, you are right. I think I made the wrong argument in the doc.
Basically, the jar packaging problem only affects platform users. In our
internal deploy service, we further reduced the deployment latency by
letting users package flink-runtime together with the uber jar, so that we
don't need to consider support for multiple Flink versions for now. In
session client mode, the Flink libs are shipped anyway as YARN local
resources, so users don't actually need to package those libs into the
job jar.




Best Regards
Peter Huang

On Mon, Dec 9, 2019 at 8:35 PM tison <[email protected]> wrote:

> > 3. What do you mean about the package? Do users need to compile their
> > jars including flink-clients, flink-optimizer, flink-table code?
>
> The answer should be no, because they exist in the system classpath.
>
> Best,
> tison.
>
>
> On Tue, Dec 10, 2019 at 12:18 PM Yang Wang <[email protected]> wrote:
>
> > Hi Peter,
> >
> > Thanks a lot for starting this discussion. I think this is a very useful
> > feature.
> >
> > This is not only relevant for Yarn. I am focused on the Flink on
> > Kubernetes integration and have come across the same problem. I do not
> > want the job graph generated on the client side. Instead, the user jars
> > are built into a user-defined image. When the job manager is launched,
> > we just need to generate the job graph based on the local user jars.
> >
> > I have some small suggestions about this.
> >
> > 1. `ProgramJobGraphRetriever` is very similar to
> > `ClasspathJobGraphRetriever`; the difference is that the former needs
> > `ProgramMetadata` and the latter needs some arguments. Is it possible
> > to have a unified `JobGraphRetriever` to support both?
> > 2. Is it possible to not use a local user jar to start a per-job
> > cluster? In your case, the user jars already exist on HDFS, yet we
> > still need to download the jars to the deployer service. Currently, we
> > always need a local user jar to start a Flink cluster. It would be
> > great if we could support remote user jars.
> > >> In the implementation, we assume users package flink-clients,
> > flink-optimizer, flink-table together within the job jar. Otherwise, the
> > job graph generation within JobClusterEntryPoint will fail.
> > 3. What do you mean about the package? Do users need to compile their
> > jars including flink-clients, flink-optimizer, flink-table code?
> >
> >
> >
> > Best,
> > Yang
> >
> > On Tue, Dec 10, 2019 at 2:37 AM Peter Huang <[email protected]> wrote:
> >
> > > Dear All,
> > >
> > > Recently, the Flink community started to improve the yarn cluster
> > > descriptor to make the job jar and config files configurable from the
> > > CLI. It improves the flexibility of Flink deployment in Yarn Per Job
> > > Mode. For platform users who manage tens of hundreds of streaming
> > > pipelines for the whole org or company, we found that job graph
> > > generation on the client side is another pain point. Thus, we want to
> > > propose a configurable feature for FlinkYarnSessionCli. The feature
> > > allows users to choose job graph generation in the Flink
> > > ClusterEntryPoint, so that the job jar doesn't need to be available
> > > locally for job graph generation. The proposal is organized as a FLIP:
> > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
> > >
> > >
> > > Any questions and suggestions are welcome. Thank you in advance.
> > >
> > >
> > > Best Regards
> > > Peter Huang
> > >
> >
>
