BTW, correct me if I misunderstand; I am still learning how our community works. Since FLIP-73 aims at introducing an interface with community consensus, the discussion should focus on the interface itself, so that we properly define a useful and extensible API. The integration story could be a follow-up, since the interface alone does not affect current behavior at all.
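
To make the interface under discussion concrete, here is a minimal sketch of the two-argument Executor#execute proposed in the quoted messages below, together with the merge rule where per-call options override the ones given at executor creation. It is only an illustration under assumptions: Pipeline and Executor are simplified stand-ins rather than the Flink classes, options are modelled as a plain Map, the option keys are made up, and execute returns the effective options instead of a JobExecutionResult so that the merge is visible.

```java
import java.util.Map;
import java.util.TreeMap;

public class ExecutorSketch {

    /** Hypothetical stand-in for a pipeline; not Flink's Pipeline. */
    interface Pipeline {
        String name();
    }

    /**
     * Simplified stand-in for the proposed two-argument interface.
     * Assumption: it returns the effective options instead of a JobExecutionResult.
     */
    interface Executor {
        Map<String, String> execute(Pipeline pipeline, Map<String, String> executionOptions)
                throws Exception;
    }

    /** Merges both option sets; per-pipeline options override executor options. */
    static Map<String, String> merge(
            Map<String, String> executorOptions, Map<String, String> executionOptions) {
        Map<String, String> effective = new TreeMap<>(executorOptions);
        effective.putAll(executionOptions);
        return effective;
    }

    /** Plays the role of the ExecutorFactory: fixes the executor-level options once. */
    static Executor createExecutor(Map<String, String> executorOptions) {
        Map<String, String> fixed = new TreeMap<>(executorOptions);
        return (pipeline, executionOptions) -> merge(fixed, executionOptions);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical option keys, chosen only for this example.
        Executor executor =
                createExecutor(Map.of("target", "session-cluster", "parallelism", "1"));

        Pipeline first = () -> "first-pipeline";
        Pipeline second = () -> "second-pipeline";

        // The same executor is reused; only the per-pipeline options differ.
        System.out.println(first.name() + " -> " + executor.execute(first, Map.of("parallelism", "4")));
        System.out.println(second.name() + " -> " + executor.execute(second, Map.of()));
    }
}
```

The point of the sketch is the reuse: one executor instance serves several pipelines, and each call only supplies what differs for that pipeline.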
Best,
tison.


Zili Chen <wander4...@gmail.com> wrote on Thu, Oct 3, 2019 at 2:02 AM:

> Hi Kostas,
>
> It seems to do no harm to have a configuration parameter on
> Executor#execute, since we can merge it with the one configured when the
> Executor was created and let the per-call one override the other.
>
> I can see it is useful: conceptually, we can create an Executor for a
> series of jobs submitted to the same cluster but with a different job
> configuration per pipeline.
>
> Best,
> tison.
>
>
> Kostas Kloudas <kklou...@apache.org> wrote on Thu, Oct 3, 2019 at 1:37 AM:
>
>> Hi again,
>>
>> I did not include this in my previous email, as it is related to the
>> proposal in the FLIP itself.
>>
>> In the existing proposal, the Executor interface is the following:
>>
>> public interface Executor {
>>
>>   JobExecutionResult execute(Pipeline pipeline) throws Exception;
>>
>> }
>>
>> This implies that all the information necessary for the execution of a
>> Pipeline should be included in the Configuration passed to the
>> ExecutorFactory which instantiates the Executor itself. This should
>> include, for example, all the parameters currently supplied by the
>> ProgramOptions, which are conceptually not executor parameters but
>> rather parameters for the execution of the specific pipeline. To this
>> end, I would like to propose a change to the current Executor
>> interface, showcased below:
>>
>>
>> public interface Executor {
>>
>>   JobExecutionResult execute(Pipeline pipeline, Configuration
>>       executionOptions) throws Exception;
>>
>> }
>>
>> The above will allow the Executor-specific options to be passed in
>> the configuration given during executor instantiation, while the
>> pipeline-specific options can be passed in the executionOptions. As a
>> positive side effect, this will make Executors reusable, i.e. we can
>> instantiate an executor and use it to execute multiple pipelines, if
>> in the future we choose to do so.
>>
>> Let me know what you think,
>> Kostas
>>
>> On Wed, Oct 2, 2019 at 7:23 PM Kostas Kloudas <kklou...@apache.org>
>> wrote:
>> >
>> > Hi all,
>> >
>> > I agree with Tison that we should disentangle threads so that people
>> > can work independently.
>> >
>> > For FLIP-73:
>> > - for Preview/OptimizedPlanEnv: I think they are orthogonal to the
>> > Executors work, as they are using the execute() method because this is
>> > the only "entry" to the user program. In this regard, I believe we
>> > should just see the fact that they have their dedicated environment as
>> > an "implementation detail".
>> > - for getting rid of the per-job mode: as a first note, there was
>> > already a discussion here:
>> > https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
>> > with many people, including myself, expressing their opinion. I am
>> > mentioning that to show that this topic already has some history and
>> > the discussion does not start from scratch; there are already some
>> > contradicting opinions. My opinion is that we should not get rid of
>> > the per-job mode, but I agree that we should discuss the
>> > semantics in more detail. Although in terms of code it may be tempting
>> > to "merge" the two submission modes, one of the main benefits of the
>> > per-job mode is isolation, both for resources and security, as the
>> > jobGraph to be executed is fixed and the cluster is "locked" just for
>> > that specific graph. This would be violated by having a session
>> > cluster launched and having all the infrastructure (ports and
>> > endpoints) set up for submitting any job to that cluster.
>> > - for getting rid of the "detached" mode: I agree with getting rid of
>> > it, but this implies some potential user-facing changes that should be
>> > discussed.
>> >
>> > Given the above, I think that:
>> > 1) in the context of FLIP-73 we should not change any semantics but
>> > simply push the existing submission logic behind a reusable
>> > abstraction and make it usable via public APIs, as Aljoscha said.
>> > 2) as Till said, changing the semantics is beyond the scope of this
>> > FLIP and, as Tison mentioned, we should work towards decoupling
>> > discussions rather than the opposite. So let's discuss the
>> > future of the per-job and detached modes in a separate thread. This
>> > will also give proper visibility to such an important
>> > topic.
>> >
>> > Cheers,
>> > Kostas
>> >
>> > On Wed, Oct 2, 2019 at 4:40 PM Zili Chen <wander4...@gmail.com> wrote:
>> > >
>> > > Thanks for your thoughts Aljoscha.
>> > >
>> > > Another question, since FLIP-73 might contain refactoring of the
>> > > Environment: shall we support something like PreviewPlanEnvironment?
>> > > If so, how? From a user's perspective the preview plan is useful: it
>> > > gives a visual view for modifying the topology and configuration
>> > > without submitting the job.
>> > >
>> > > Best,
>> > > tison.
>> > >
>> > >
>> > > Aljoscha Krettek <aljos...@apache.org> wrote on Wed, Oct 2, 2019 at 10:10 PM:
>> > >
>> > > > I agree with Till that we should not change the semantics of
>> > > > per-job mode. In my opinion per-job mode means that the cluster
>> > > > (JobManager) is brought up with one job and it only executes that
>> > > > one job. There should be no open ports/anything that would allow
>> > > > submitting further jobs. This is very important for deployments in
>> > > > docker/Kubernetes or other environments where you bring up jobs
>> > > > without necessarily having the notion of a Flink cluster.
>> > > >
>> > > > What this means for a user program that has multiple execute()
>> > > > calls is that you will get a fresh cluster for each execute call.
>> > > > This also means that further execute() calls will only happen if
>> > > > the “client” is still alive, because it is the one driving
>> > > > execution. Currently, this only works if you start the job in
>> > > > “attached” mode. If you start in “detached” mode, only the first
>> > > > execute() will happen and the rest will be ignored.
>> > > >
>> > > > This brings us to the tricky question of what to do about
>> > > > “detached” and “attached”. In the long run, I would like to get rid
>> > > > of the distinction and leave it up to the user program, by either
>> > > > blocking or not on the Future (or JobClient or whatnot) that job
>> > > > submission returns. This, however, means that users cannot simply
>> > > > request “detached” execution when using bin/flink; the user program
>> > > > has to “play along”. On the other hand, “detached” mode is quite
>> > > > strange for the user program. The execute() call either returns
>> > > > with a proper job result after the job ran (in “attached” mode) or
>> > > > with a dummy result (in “detached” mode) right after submission. I
>> > > > think this can even lead to weird cases where multiple “execute()”
>> > > > calls run in parallel. For per-job detached mode we also “throw”
>> > > > out of the first execute, so the rest (including result processing
>> > > > logic) is ignored.
>> > > >
>> > > > For FLIP-73 itself we can (and should) ignore these problems,
>> > > > because FLIP-73 only moves the existing submission logic behind a
>> > > > reusable abstraction and makes it usable via API. We should closely
>> > > > follow up on the above points though, because I think they are also
>> > > > important.
>> > > >
>> > > > Best,
>> > > > Aljoscha
>> > > >
>> > > > > On 2. Oct 2019, at 12:08, Zili Chen <wander4...@gmail.com> wrote:
>> > > > >
>> > > > > Thanks for your clarification Till.
>> > > > >
>> > > > > I agree with the current semantics of the per-job mode: one should
>> > > > > deploy a new cluster for each part of the job. Apart from the
>> > > > > performance concern, it also means that PerJobExecutor actually
>> > > > > knows how to deploy a cluster, which differs from the description
>> > > > > that an Executor submits a job.
>> > > > >
>> > > > > Anyway, it sounds workable and narrows the changes.
>> > > >
>> > > >
>>
>
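
As a side note on the “attached”/“detached” point quoted above: the idea of leaving the choice to the user program, by blocking or not on the Future that submission returns, can be sketched roughly as follows. This is only an illustration under assumptions: submit is a hypothetical helper, not a Flink API, and the job is simulated with a CompletableFuture from the JDK.

```java
import java.util.concurrent.CompletableFuture;

public class SubmissionSketch {

    /**
     * Hypothetical non-blocking submission. The returned future completes
     * with a (simulated) job result once the job has finished.
     */
    static CompletableFuture<String> submit(String jobName) {
        return CompletableFuture.supplyAsync(() -> "result of " + jobName);
    }

    public static void main(String[] args) {
        // "Attached" behaviour: the program blocks until the job result is available.
        String result = submit("job-1").join();
        System.out.println(result);

        // "Detached" behaviour: the program keeps going without waiting for the result.
        CompletableFuture<String> pending = submit("job-2");
        System.out.println("job-2 submitted, continuing without waiting");

        // Only for this demo: wait at the end so the simulated job finishes
        // before the JVM exits.
        pending.join();
    }
}
```

With this shape there is no separate mode flag; whether a run behaves as attached or detached is decided by the user program at each call site.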