Hi Peter,

I'm afraid that FLIP-73 also changes how per-job mode works. Please check that work first. You can search for AbstractJobClusterExecutor and its call graph.
As for how it influences your proposal FLIP-85, I already mentioned above that

> the user program is designed to ALWAYS run on the client side. Specifically,
> we deploy the job in the executor when env.execute is called. This abstraction
> possibly prevents Flink from running the user program on the cluster side.

Best,
tison.


On Thu, Dec 19, 2019 at 2:54 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:

> Hi Yang,
>
> Thanks for your input. I can see that master-side job graph generation is a
> common requirement for per-job mode.
> I think FLIP-73 is mainly for session mode. I think the proposal is a
> valid improvement for the existing CLI and per-job mode.
>
>
> Best Regards
> Peter Huang
>
> On Wed, Dec 18, 2019 at 3:21 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
>> I just want to revive this discussion.
>>
>> Recently, I have been thinking about how to natively run a Flink per-job
>> cluster on Kubernetes.
>> The per-job mode on Kubernetes is very different from the one on Yarn, and
>> we will have the same deployment requirements for the client and the entry
>> point.
>>
>> 1. The Flink client does not always need a local jar to start a Flink
>> per-job cluster. We could support multiple schemes. For example,
>> file:///path/of/my.jar means a jar located on the client side,
>> hdfs://myhdfs/user/myname/flink/my.jar means a jar located on remote hdfs,
>> and local:///path/in/image/my.jar means a jar located on the jobmanager
>> side.
>>
>> 2. Support running the user program on the master side. This also means the
>> entry point will generate the job graph on the master side. We could use
>> the ClasspathJobGraphRetriever or start a local Flink client to achieve
>> this.
>>
>>
>> cc tison, Aljoscha & Kostas: do you think this is the right direction to
>> work in?
>>
>> On Thu, Dec 12, 2019 at 4:48 PM tison <wander4...@gmail.com> wrote:
>>
>> > A quick idea is that we separate the deployment from the user program,
>> > so that it is always done outside the program.
>> > When the user program is executed, there is always a ClusterClient that
>> > communicates with an existing cluster, remote or local. That will be a
>> > separate thread, so this is just for your information.
>> >
>> > Best,
>> > tison.
>> >
>> >
>> > On Thu, Dec 12, 2019 at 4:40 PM tison <wander4...@gmail.com> wrote:
>> >
>> > > Hi Peter,
>> > >
>> > > Another concern I realized recently is that, with the current Executors
>> > > abstraction (FLIP-73), I'm afraid that the user program is designed to
>> > > ALWAYS run on the client side. Specifically, we deploy the job in the
>> > > executor when env.execute is called. This abstraction possibly prevents
>> > > Flink from running the user program on the cluster side.
>> > >
>> > > For your proposal, in this case we have already compiled the program and
>> > > run it on the client side, so even if we deploy a cluster and retrieve
>> > > the job graph from program metadata, it doesn't make much sense.
>> > >
>> > > cc Aljoscha & Kostas: what do you think about this constraint?
>> > >
>> > > Best,
>> > > tison.
>> > >
>> > >
>> > > On Tue, Dec 10, 2019 at 12:45 PM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>> > >
>> > >> Hi Tison,
>> > >>
>> > >> Yes, you are right. I think I made the wrong argument in the doc.
>> > >> Basically, the jar packaging problem only affects platform users. In
>> > >> our internal deploy service, we further optimized the deployment
>> > >> latency by letting users package flink-runtime together with the uber
>> > >> jar, so that we don't need to consider multiple Flink version support
>> > >> for now. In the session client mode, Flink libs will be shipped anyway
>> > >> as local resources of Yarn, so users actually don't need to package
>> > >> those libs into the job jar.
>> > >>
>> > >>
>> > >>
>> > >> Best Regards
>> > >> Peter Huang
>> > >>
>> > >> On Mon, Dec 9, 2019 at 8:35 PM tison <wander4...@gmail.com> wrote:
>> > >>
>> > >> > > 3. What do you mean about the package?
>> > >> > > Do users need to compile their jars including flink-clients,
>> > >> > > flink-optimizer, flink-table codes?
>> > >> >
>> > >> > The answer should be no because they exist in the system classpath.
>> > >> >
>> > >> > Best,
>> > >> > tison.
>> > >> >
>> > >> >
>> > >> > On Tue, Dec 10, 2019 at 12:18 PM Yang Wang <danrtsey...@gmail.com> wrote:
>> > >> >
>> > >> > > Hi Peter,
>> > >> > >
>> > >> > > Thanks a lot for starting this discussion. I think this is a very
>> > >> > > useful feature.
>> > >> > >
>> > >> > > This is not only relevant for Yarn. I am focused on the Flink on
>> > >> > > Kubernetes integration and came across the same problem. I do not
>> > >> > > want the job graph generated on the client side. Instead, the user
>> > >> > > jars are built into a user-defined image. When the job manager is
>> > >> > > launched, we just need to generate the job graph based on the local
>> > >> > > user jars.
>> > >> > >
>> > >> > > I have some small suggestions about this.
>> > >> > >
>> > >> > > 1. `ProgramJobGraphRetriever` is very similar to
>> > >> > > `ClasspathJobGraphRetriever`. The difference is that the former
>> > >> > > needs `ProgramMetadata` and the latter needs some arguments. Is it
>> > >> > > possible to have a unified `JobGraphRetriever` to support both?
>> > >> > > 2. Is it possible to not use a local user jar to start a per-job
>> > >> > > cluster? In your case, the user jars already exist on hdfs and we
>> > >> > > do need to download the jars to the deployer service. Currently, we
>> > >> > > always need a local user jar to start a Flink cluster. It would be
>> > >> > > great if we could support remote user jars.
>> > >> > > >> In the implementation, we assume users package flink-clients,
>> > >> > > >> flink-optimizer, flink-table together within the job jar.
>> > >> > > >> Otherwise, the job graph generation within JobClusterEntryPoint
>> > >> > > >> will fail.
>> > >> > > 3. What do you mean about the package? Do users need to compile
>> > >> > > their jars including flink-clients, flink-optimizer, flink-table
>> > >> > > codes?
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > Best,
>> > >> > > Yang
>> > >> > >
>> > >> > > On Tue, Dec 10, 2019 at 2:37 AM Peter Huang <huangzhenqiu0...@gmail.com> wrote:
>> > >> > >
>> > >> > > > Dear All,
>> > >> > > >
>> > >> > > > Recently, the Flink community started to improve the yarn cluster
>> > >> > > > descriptor to make the job jar and config files configurable from
>> > >> > > > the CLI. This improves the flexibility of Flink deployment in Yarn
>> > >> > > > Per Job Mode. For platform users who manage tens of hundreds of
>> > >> > > > streaming pipelines for the whole org or company, we found that
>> > >> > > > client-side job graph generation is another pain point. Thus, we
>> > >> > > > want to propose a configurable feature for FlinkYarnSessionCli.
>> > >> > > > The feature allows users to choose job graph generation in the
>> > >> > > > Flink ClusterEntryPoint, so that the job jar doesn't need to be
>> > >> > > > available locally for the job graph generation. The proposal is
>> > >> > > > organized as a FLIP:
>> > >> > > >
>> > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
>> > >> > > >
>> > >> > > > Any questions and suggestions are welcome. Thank you in advance.
>> > >> > > >
>> > >> > > >
>> > >> > > > Best Regards
>> > >> > > > Peter Huang
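
For reference, the three jar locations Yang lists in the quoted thread (file://, hdfs://, local://) could be told apart by URI scheme roughly as follows. This is only a minimal sketch to illustrate the idea: the class name `JarLocationSketch`, the enum `JarLocation`, and the method `classify` are hypothetical and are not part of Flink or of FLIP-85, and treating a scheme-less path as a client-side jar is my own assumption.

```java
import java.net.URI;

// Hypothetical illustration only; none of these names exist in Flink.
final class JarLocationSketch {

    // Where the user jar lives, following the three cases in Yang's mail.
    enum JarLocation { CLIENT, REMOTE_STORAGE, JOB_MANAGER }

    // Map a jar URI to its location based on the URI scheme.
    static JarLocation classify(String jarUri) {
        String scheme = URI.create(jarUri).getScheme();
        if (scheme == null) {
            // Assumption: a plain path without a scheme is a client-side jar.
            return JarLocation.CLIENT;
        }
        switch (scheme) {
            case "file":
                return JarLocation.CLIENT;          // jar on the client machine
            case "hdfs":
                return JarLocation.REMOTE_STORAGE;  // jar on remote hdfs
            case "local":
                return JarLocation.JOB_MANAGER;     // jar inside the image
            default:
                throw new IllegalArgumentException("Unsupported scheme: " + scheme);
        }
    }

    public static void main(String[] args) {
        String[] examples = {
            "file:///path/of/my.jar",
            "hdfs://myhdfs/user/myname/flink/my.jar",
            "local:///path/in/image/my.jar"
        };
        for (String jar : examples) {
            System.out.println(jar + " -> " + classify(jar));
        }
    }
}
```

With something like this, only the CLIENT case would require the jar to be present where the client runs; the other two cases are the ones where the entry point would have to generate the job graph on the master side.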