Hi Yang,

I understand your point. For a Kubernetes per-job cluster, users only have the image path for starting the job. The user code is inaccessible. I think this is a common question for containerized deployments (for example, YARN with a Docker image) after FLIP-73. Let's get some feedback from @Aljoscha Krettek <aljos...@apache.org> and @zjf...@gmail.com <zjf...@gmail.com>.

Best Regards
Peter Huang

On Mon, Dec 30, 2019 at 1:48 AM Yang Wang <danrtsey...@gmail.com> wrote:

> Hi Peter,
>
> Certainly, we could add an 'if-else' in `AbstractJobClusterExecutor` to handle the different deploy modes. However, I think we need to avoid executing any user program code in cluster deploy-mode, including in the `ExecutionEnvironment`.
> Let's wait for some feedback from FLIP-73's author @Aljoscha Krettek <aljos...@apache.org> and @zjf...@gmail.com <zjf...@gmail.com>.
>
> Hi Dian Fu,
>
> Many thanks for jumping in and giving these very useful suggestions.
>
> >> 1) It's better to have a whole design for this feature
> You are right. We should not add a specific config option "execution.deploy-mode" for per-job. In per-job mode, the job graph will be generated in the JobClusterEntrypoint when the deploy-mode is cluster. Standalone per-job mode already does this with `ClasspathJobGraphRetriever`. Session mode will be more complicated. We should not generate the job graph in the entrypoint. Instead, for each job, we need to do it separately and then submit the job through a local client. Peter and I will try to enrich this part of the design doc. As for the implementation, we could do it phase by phase.
>
> >> 2) It's better to consider the convenience for users, such as debugging
> Indeed, if the deploy-mode is cluster, debugging may not be convenient for users. Different clusters have different ways of debugging, for example using 'yarn logs' or 'kubectl logs' to get the jobmanager logs. We could also consider throwing the exception back to the client via REST, although I'm not sure whether that is achievable. Compared to the client deploy-mode, it is really a step back in user experience. We will try to add more description of the user experience to the document.
>
> >> 3) It's better to consider the impact to the stability of the cluster
> I do not think it will have much negative impact on the cluster. Yarn, Kubernetes and other resource management clusters give good isolation between applications. One failed app should not affect the others. If an error occurs while generating the job graph, the jobmanager process will fail very fast and the whole app will deregister after several attempts. We cannot fully avoid this; even in client deploy-mode, it could also happen when the user specifies a wrong checkpoint path.
>
> Best,
> Yang
>
> Dian Fu <dian0511...@gmail.com> wrote on Mon, Dec 30, 2019 at 1:44 PM:
>
> > Hi all,
> >
> > Sorry to jump into this discussion. Thanks everyone for the discussion. I'm very interested in this topic although I'm not an expert in this part, so I'm glad to share my thoughts as follows:
> >
> > 1) It's better to have a whole design for this feature
> > As we know, there are two deployment modes: per-job mode and session mode. I'm wondering which mode really needs this feature. As the design doc mentions, per-job mode is used more for streaming jobs and session mode is usually used for batch jobs (of course, the job types and the deployment modes are orthogonal). Usually a streaming job only needs to be submitted once and will run for days or weeks, while batch jobs are submitted more frequently. This means that session mode may also need this feature. However, if we support this feature in session mode, the application master will become the new centralized service (which should be solved). So in this case, it's better to have a complete design for both per-job mode and session mode. Furthermore, even if we do it phase by phase, we need to have a whole picture of how it works in both per-job mode and session mode.
> >
> > 2) It's better to consider the convenience for users, such as debugging
> > After we finish this feature, the job graph will be compiled in the application master, which means that users cannot easily get the exception message synchronously in the job client if there are problems during job graph compilation (especially for platform users), for example when the resource path is incorrect or the user program itself has problems. What I'm thinking is that maybe we should throw the exceptions as early as possible (during the job submission stage).
> >
> > 3) It's better to consider the impact to the stability of the cluster
> > If we perform the compilation in the application master, we should consider the impact of compilation errors. Although YARN can resume the application master in case of failures, in some cases a compilation failure may waste cluster resources and may affect the stability of the cluster and the other jobs in it, for example when the resource path is incorrect or the user program itself has problems (in this case, job failover cannot solve the problem). In the current implementation, compilation errors are handled on the client side and there is no impact on the cluster at all.
> >
> > Regarding 1), the design doc clearly states that only per-job mode will be supported. However, I think it's better to also consider session mode in the design doc.
> > Regarding 2) and 3), I have not seen related sections in the design doc. It would be good if we could cover them there.
> >
> > Feel free to correct me if there is anything I misunderstand.
> >
> > Regards,
> > Dian
> >
> > > On Dec 27, 2019, at 3:13 AM, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > >
> > > Hi Yang,
> > >
> > > I can't agree more. The effort definitely needs to align with the final goal of FLIP-73.
> > > I am thinking about whether we can achieve the goal in two phases.
> > >
> > > 1) Phase I
> > > As the CliFrontend will not be deprecated soon, we can still use the deployMode flag there, pass the program info through the Flink configuration, and use the ClassPathJobGraphRetriever to generate the job graph in the ClusterEntrypoints of Yarn and Kubernetes.
> > >
> > > 2) Phase II
> > > In AbstractJobClusterExecutor, the job graph is generated in the execute function. We can still use the deployMode in it. With deployMode = cluster, the execute function only starts the cluster.
> > >
> > > When {Yarn/Kubernetes}PerJobClusterEntrypoint starts, it will start the dispatcher first; then we can use a ClusterEnvironment similar to ContextEnvironment to submit the job with jobName to the local dispatcher. For the details, we need more investigation. Let's wait for @Aljoscha Krettek <aljos...@apache.org> and @Till Rohrmann <trohrm...@apache.org>'s feedback after the holiday season.
> > >
> > > Thank you in advance. Merry Christmas and Happy New Year!!!
> > >
> > > Best Regards
> > > Peter Huang
> > >
> > > On Wed, Dec 25, 2019 at 1:08 AM Yang Wang <danrtsey...@gmail.com> wrote:
> > >
> > >> Hi Peter,
> > >>
> > >> I think we need to reconsider tison's suggestion seriously. After FLIP-73, deployJobCluster has been moved into `JobClusterExecutor#execute`. `CliFrontend` should not be aware of it. That means the user program will *ALWAYS* be executed on the client side. This is the behavior by design. So we cannot just add `if (client mode) ... else if (cluster mode) ...` code in `CliFrontend` to bypass the executor. We need to find a clean way to decouple executing the user program from deploying the per-job cluster.
> > >> Based on this, we could support executing the user program on either the client or the master side.
> > >>
> > >> Maybe Aljoscha and Jeff could give some good suggestions.
> > >>
> > >> Best,
> > >> Yang
> > >>
> > >> Peter Huang <huangzhenqiu0...@gmail.com> wrote on Wed, Dec 25, 2019 at 4:03 AM:
> > >>
> > >>> Hi Jingjing,
> > >>>
> > >>> The proposed improvement is a deployment option for the CLI. For SQL-based Flink applications, it is more convenient to use the existing model in SqlClient, in which the job graph is generated within SqlClient. After adding the delayed job graph generation, I think no change is needed on your side.
> > >>>
> > >>> Best Regards
> > >>> Peter Huang
> > >>>
> > >>> On Wed, Dec 18, 2019 at 6:01 AM jingjing bai <baijingjing7...@gmail.com> wrote:
> > >>>
> > >>>> Hi Peter,
> > >>>>
> > >>>> We have extended SqlClient to support submitting SQL jobs from the web, based on Flink 1.9. We support submitting to Yarn in per-job mode too. In this case, the job graph is generated on the client side. I think this discussion is mainly about improving API programs, but in my case there is no jar to upload, only a SQL string. Do you have more suggestions for improving the SQL mode, or is it only a switch for API programs?
> > >>>>
> > >>>> Best,
> > >>>> bai jj
> > >>>>
> > >>>> Yang Wang <danrtsey...@gmail.com> wrote on Wed, Dec 18, 2019 at 7:21 PM:
> > >>>>
> > >>>>> I just want to revive this discussion.
> > >>>>>
> > >>>>> Recently, I have been thinking about how to natively run a Flink per-job cluster on Kubernetes. The per-job mode on Kubernetes is very different from the one on Yarn, and we will have the same deployment requirements for the client and entry point.
> > >>>>>
> > >>>>> 1. The Flink client does not always need a local jar to start a Flink per-job cluster.
> > >>>>> We could support multiple schemes. For example, file:///path/of/my.jar means a jar located on the client side, hdfs://myhdfs/user/myname/flink/my.jar means a jar located on remote HDFS, and local:///path/in/image/my.jar means a jar located on the jobmanager side.
> > >>>>>
> > >>>>> 2. Support running the user program on the master side. This also means the entry point will generate the job graph on the master side. We could use the ClasspathJobGraphRetriever or start a local Flink client to achieve this.
> > >>>>>
> > >>>>> cc tison, Aljoscha & Kostas: do you think this is the right direction to work in?
> > >>>>>
> > >>>>> tison <wander4...@gmail.com> wrote on Thu, Dec 12, 2019 at 4:48 PM:
> > >>>>>
> > >>>>>> A quick idea is that we separate the deployment from the user program, so that it is always done outside the program. When the user program is executed, there is always a ClusterClient that communicates with an existing cluster, remote or local. That will be another thread, so this is just for your information.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> tison.
> > >>>>>>
> > >>>>>> tison <wander4...@gmail.com> wrote on Thu, Dec 12, 2019 at 4:40 PM:
> > >>>>>>
> > >>>>>>> Hi Peter,
> > >>>>>>>
> > >>>>>>> Another concern I realized recently is that with the current Executors abstraction (FLIP-73), I'm afraid the user program is designed to ALWAYS run on the client side. Specifically, we deploy the job in the executor when env.execute is called. This abstraction possibly prevents Flink from running the user program on the cluster side.
> > >>>>>>>
> > >>>>>>> As for your proposal: in this case we have already compiled the program and run it on the client side, so even if we deploy a cluster and retrieve the job graph from program metadata, it doesn't make much sense.
> > >>>>>>>
> > >>>>>>> cc Aljoscha & Kostas, what do you think about this constraint?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> tison.
> > >>>>>>>
> > >>>>>>> Peter Huang <huangzhenqiu0...@gmail.com> wrote on Tue, Dec 10, 2019 at 12:45 PM:
> > >>>>>>>
> > >>>>>>>> Hi Tison,
> > >>>>>>>>
> > >>>>>>>> Yes, you are right. I think I made the wrong argument in the doc. Basically, the jar packaging problem only concerns platform users. In our internal deploy service, we further optimized the deployment latency by letting users package flink-runtime together with the uber jar, so that we don't need to consider supporting multiple Flink versions for now. In session client mode, the Flink libs will be shipped anyway as local resources of Yarn, so users actually don't need to package those libs into the job jar.
> > >>>>>>>>
> > >>>>>>>> Best Regards
> > >>>>>>>> Peter Huang
> > >>>>>>>>
> > >>>>>>>> On Mon, Dec 9, 2019 at 8:35 PM tison <wander4...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>>> 3. What do you mean about the package? Do users need to compile their jars including flink-clients, flink-optimizer, flink-table codes?
> > >>>>>>>>>
> > >>>>>>>>> The answer should be no, because they exist in the system classpath.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> tison.
> > >>>>>>>>>
> > >>>>>>>>> Yang Wang <danrtsey...@gmail.com> wrote on Tue, Dec 10, 2019 at 12:18 PM:
> > >>>>>>>>>
> > >>>>>>>>>> Hi Peter,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks a lot for starting this discussion. I think this is a very useful feature.
> > >>>>>>>>>>
> > >>>>>>>>>> This is not only relevant for Yarn: I am focused on the Flink on Kubernetes integration and have come across the same problem. I do not want the job graph generated on the client side. Instead, the user jars are built into a user-defined image. When the job manager is launched, we just need to generate the job graph based on the local user jars.
> > >>>>>>>>>>
> > >>>>>>>>>> I have some small suggestions about this.
> > >>>>>>>>>>
> > >>>>>>>>>> 1. `ProgramJobGraphRetriever` is very similar to `ClasspathJobGraphRetriever`; the difference is that the former needs `ProgramMetadata` and the latter needs some arguments. Is it possible to have a unified `JobGraphRetriever` that supports both?
> > >>>>>>>>>> 2. Is it possible to start a per-job cluster without a local user jar? In your case, the user jars already exist on HDFS and we do need to download the jars to the deployer service. Currently, we always need a local user jar to start a Flink cluster. It would be great if we could support remote user jars.
> > >>>>>>>>>>>> In the implementation, we assume users package flink-clients, flink-optimizer, flink-table together within the job jar. Otherwise, the job graph generation within JobClusterEntryPoint will fail.
> > >>>>>>>>>> 3. What do you mean about the package? Do users need to compile their jars including flink-clients, flink-optimizer, flink-table codes?
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Yang
> > >>>>>>>>>>
> > >>>>>>>>>> Peter Huang <huangzhenqiu0...@gmail.com> wrote on Tue, Dec 10, 2019 at 2:37 AM:
> > >>>>>>>>>>
> > >>>>>>>>>>> Dear All,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Recently, the Flink community started to improve the Yarn cluster descriptor to make the job jar and config files configurable from the CLI. This improves the flexibility of Flink deployment in Yarn per-job mode. For platform users who manage tens of hundreds of streaming pipelines for a whole org or company, we found that job graph generation on the client side is another pain point. Thus, we want to propose a configurable feature for FlinkYarnSessionCli. The feature allows users to choose job graph generation in the Flink ClusterEntryPoint, so that the job jar doesn't need to be local for the job graph generation. The proposal is organized as a FLIP:
> > >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85+Delayed+JobGraph+Generation
> > >>>>>>>>>>>
> > >>>>>>>>>>> Any questions and suggestions are welcome. Thank you in advance.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best Regards
> > >>>>>>>>>>> Peter Huang
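
As an illustration of the jar location schemes suggested earlier in the thread (file://, hdfs://, local://), here is a minimal Java sketch of how a client might decide whether a user jar has to be shipped. The `JarLocations` class, its `Location` enum and the `needsUpload` helper are hypothetical names made up for this example; they are not Flink APIs. Treating a path without a scheme as a client-side jar is also an assumption, and a real implementation would likely support more schemes.

```java
import java.net.URI;

/** Hypothetical helper (not part of Flink): classifies a user jar by its URI scheme. */
final class JarLocations {

    /** Where the jar lives, following the three examples given in the thread. */
    enum Location { CLIENT, REMOTE, JOB_MANAGER }

    static Location classify(String jarUri) {
        String scheme = URI.create(jarUri).getScheme();
        // Assumption: a plain path without a scheme is treated like file://.
        if (scheme == null || scheme.equals("file")) {
            return Location.CLIENT;
        }
        switch (scheme) {
            case "hdfs":
                return Location.REMOTE;       // jar already on remote HDFS
            case "local":
                return Location.JOB_MANAGER;  // jar baked into the image
            default:
                throw new IllegalArgumentException("Unsupported jar scheme: " + scheme);
        }
    }

    /** Assumption: only jars located on the client side need to be shipped. */
    static boolean needsUpload(String jarUri) {
        return classify(jarUri) == Location.CLIENT;
    }
}
```

With this sketch, `JarLocations.needsUpload("local:///path/in/image/my.jar")` is false, which matches the Kubernetes case above where the user jar is already part of the image and the job graph would be generated on the master side.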