Re: [DISCUSS] FLIP-73: Introducing Executors for job submission

Kostas Kloudas Wed, 02 Oct 2019 10:24:38 -0700

Hi all,

I agree with Tison that we should disentangle threads so that people
can work independently.


For FLIP-73:
 - for Preview/OptimizedPlanEnv: I think they are orthogonal to the
Executors work, as they use the execute() method only because it is
the sole "entry" point to the user program. In this regard, I believe we
should just treat the fact that they have their dedicated environment as
an "implementation detail".
 - for getting rid of the per-job mode: as a first note, there was
already a discussion here:
https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
with many people, including myself, expressing their opinion. I
mention this to show that the topic already has some history: the
discussion does not start from scratch, and there are already some
conflicting opinions. My opinion is that we should not get rid of
the per-job mode, but I agree that we should discuss the
semantics in more detail. Although in terms of code it may be tempting
to "merge" the two submission modes, one of the main benefits of the
per-job mode is isolation, both for resources and security, as the
jobGraph to be executed is fixed and the cluster is "locked" just for
that specific graph. This would be violated by launching a session
cluster and setting up all the infrastructure (ports and
endpoints) needed to submit any job to that cluster.
 - for getting rid of the "detached" mode: I agree with getting rid of
it but this implies some potential user-facing changes that should be
discussed.

Given the above, I think that:
1) in the context of FLIP-73 we should not change any semantics but
simply push the existing submission logic behind a reusable
abstraction and make it usable via public APIs, as Aljoscha said.
2) as Till said, changing the semantics is beyond the scope of this
FLIP and as Tison mentioned we should work towards decoupling
discussions rather than the opposite. So let's discuss the
future of the per-job and detached modes in a separate thread. This
will also allow us to give proper visibility to such an important
topic.
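
To make point 1 more concrete, here is a minimal sketch of what "submission
logic behind a reusable abstraction" could look like. All names in it
(JobExecutor, PerJobExecutor, the String-based job and result types) are
hypothetical and chosen only for illustration; they are not the interfaces
proposed in FLIP-73 or present in Flink. The sketch assumes the per-job
semantics described in this thread (a fresh cluster per execute() call) and
shows how a returned future would let the caller choose between blocking
("attached") and not blocking ("detached").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ExecutorSketch {

    /** Hypothetical abstraction for illustration; not Flink's actual interface. */
    interface JobExecutor {
        // Submits a job and returns a future that completes with its result.
        CompletableFuture<String> execute(String jobName);
    }

    /** Hypothetical per-job executor: every execute() call gets its own "cluster". */
    static final class PerJobExecutor implements JobExecutor {
        // Records the simulated clusters so the one-cluster-per-job rule is visible.
        final List<String> clustersStarted = new ArrayList<>();

        @Override
        public CompletableFuture<String> execute(String jobName) {
            // Stand-in for deploying a dedicated cluster for this single job.
            String clusterId = "cluster-" + (clustersStarted.size() + 1);
            clustersStarted.add(clusterId);
            // Stand-in for running the job asynchronously on that cluster.
            return CompletableFuture.supplyAsync(
                    () -> jobName + " finished on " + clusterId);
        }
    }

    public static void main(String[] args) {
        PerJobExecutor executor = new PerJobExecutor();

        // "Attached" behaviour: the caller blocks on the future.
        String result = executor.execute("job-a").join();
        System.out.println(result);

        // "Detached" behaviour: the caller keeps the future and moves on.
        CompletableFuture<String> pending = executor.execute("job-b");
        System.out.println("job-b submitted, clusters so far: "
                + executor.clustersStarted.size());

        // Joined here only so the demo does not exit before the task runs.
        pending.join();
    }
}
```

In this sketch the executor itself has no notion of "attached" or "detached";
that choice is left entirely to the calling code, which is the long-term
direction mentioned in the quoted discussion rather than something FLIP-73
itself would change.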

Cheers,
Kostas

On Wed, Oct 2, 2019 at 4:40 PM Zili Chen <wander4...@gmail.com> wrote:
>
> Thanks for your thoughts Aljoscha.
>
> Another question, since FLIP-73 might contain refactorings of Environment:
> shall we support something like PreviewPlanEnvironment? If so, how? From a
> user perspective, the preview plan is useful: it gives a visual view that
> lets users modify the topology and configuration without submitting the job.
>
> Best,
> tison.
>
>
> Aljoscha Krettek <aljos...@apache.org> wrote on Wed, Oct 2, 2019 at 10:10 PM:
>
> > I agree with Till that we should not change the semantics of per-job mode.
> > In my opinion per-job mode means that the cluster (JobManager) is brought
> > up with one job and it only executes that one job. There should be no open
> > ports/anything that would allow submitting further jobs. This is very
> > important for deployments in Docker/Kubernetes or other environments where
> > you bring up jobs without necessarily having the notion of a Flink cluster.
> >
> > What this means for a user program that has multiple execute() calls is
> > that you will get a fresh cluster for each execute() call. This also means
> > that further execute() calls will only happen if the “client” is still
> > alive, because it is the one driving execution. Currently, this only works
> > if you start the job in “attached” mode. If you start in “detached” mode
> > only the first execute() will happen and the rest will be ignored.
> >
> > This brings us to the tricky question about what to do about “detached”
> > and “attached”. In the long run, I would like to get rid of the distinction
> > and leave it up to the user program, by either blocking or not on the
> > Future (or JobClient or whatnot) that job submission returns. This,
> > however, means that users cannot simply request “detached” execution when
> > using bin/flink, the user program has to “play along”. On the other hand,
> > “detached” mode is quite strange for the user program. The execute() call
> > either returns with a proper job result after the job ran (in “attached”
> > mode) or with a dummy result (in “detached” mode) right after submission. I
> > think this can even lead to weird cases where multiple execute() calls run in
> > parallel. For per-job detached mode we also “throw” out of the first
> > execute() so the rest (including result processing logic) is ignored.
> >
> > For this here FLIP-73 we can (and should) ignore these problems, because
> > FLIP-73 only moves the existing submission logic behind a reusable
> > abstraction and makes it usable via API. We should closely follow up on the
> > above points though because I think they are also important.
> >
> > Best,
> > Aljoscha
> >
> > > On 2. Oct 2019, at 12:08, Zili Chen <wander4...@gmail.com> wrote:
> > >
> > > Thanks for your clarification Till.
> > >
> > > I agree with the current semantics of the per-job mode: one should deploy a
> > > new cluster for each part of the job. Apart from the performance concern,
> > > it also means that PerJobExecutor actually knows how to deploy a cluster,
> > > which differs from the description that an Executor submits a job.
> > >
> > > Anyway, it sounds workable and narrows the changes.
> >
> >
