Hi Tison,

Thanks for the FLIP and launching the discussion!

As a first note, big +1 on providing/exposing a JobClient to the users!

Some points that it would be nice to clarify:
1) You mention that we can get rid of the DETACHED mode: I agree that,
at a high level, given that everything will now be asynchronous, there
is no need to keep the DETACHED mode, but I think we should specify
some aspects. For example, without the explicit separation of the
modes, what happens when the job finishes? Does the client always
poll periodically for the result, or is the result pushed when in
NON-DETACHED mode? What happens if the client disconnects and
reconnects?
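
To make the polling option in 1) concrete, here is a minimal sketch of
a client-side helper that polls for a job result and exposes it as a
future. Every name in it (JobResultSketch, pollForResult, the
queryCluster supplier) is hypothetical and only stands in for whatever
FLIP-74 ends up defining; it is an illustration under those
assumptions, not a proposal for the actual interface.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink or of FLIP-74.
final class JobResultSketch {
    private JobResultSketch() {}

    /**
     * Polls the cluster at a fixed interval until a result is available.
     *
     * @param queryCluster assumed to return null while the job is still
     *                     running and the final result once it has finished
     */
    static CompletableFuture<String> pollForResult(
            Supplier<String> queryCluster,
            ScheduledExecutorService timer,
            long intervalMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        ScheduledFuture<?> polling = timer.scheduleAtFixedRate(() -> {
            try {
                String value = queryCluster.get();
                if (value != null) {
                    result.complete(value);
                }
            } catch (RuntimeException e) {
                // Surface query failures instead of silently stopping the poll.
                result.completeExceptionally(e);
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
        // Stop polling once the future is done, whether by result, failure or cancel.
        result.whenComplete((value, error) -> polling.cancel(false));
        return result;
    }
}
```

With this shape, a client that disconnects and reconnects would simply
call pollForResult again, since no state is kept on the client side. A
push-based variant would instead need the cluster to track connected
clients, which is the part I think the FLIP should spell out.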

2) On the "how to retrieve a JobClient for a running Job", I think
this is related to the other discussion you opened in the ML about
multi-layered clients. First of all, I agree that exposing different
"levels" of clients would be a nice addition, and actually there have
been some discussions about doing so in the future. Now for this
specific discussion:
      i) I do not think that we should expose the
ClusterDescriptor/ClusterSpecification to the user, as this ties us to
a specific architecture which may change in the future.
     ii) I do not think it should be the Executor that provides a
JobClient for an already running job (it should do so only for the
jobs that it submits). The job of the executor should just be to
execute() a pipeline.
    iii) I think a solution that respects the separation of concerns
could be the addition of another component (in the future), something
like a ClientFactory or ClusterFactory, with methods like:
ClusterClient createCluster(Configuration), JobClient
retrieveJobClient(Configuration, JobId), maybe even (although I am not
sure) Executor getExecutor(Configuration), and maybe more. This
component would be responsible for interacting with a cluster manager
like Yarn and for doing what is now done by the ClusterDescriptor,
plus some more.
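
The component described in iii) could be sketched roughly as follows.
The three method signatures are the ones listed above; everything else
(the stand-in Configuration, JobId, JobClient, ClusterClient and
Executor types, and the in-memory implementation) is made up so that
the example is self-contained, and none of it is Flink's real API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Self-contained stand-ins; none of these are Flink's real classes.
final class ClientFactorySketch {
    private ClientFactorySketch() {}

    static final class Configuration {}

    static final class JobId {
        final String value;
        JobId(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof JobId && ((JobId) o).value.equals(value);
        }
        @Override public int hashCode() { return value.hashCode(); }
    }

    interface JobClient { JobId getJobId(); }

    interface ClusterClient { JobClient submit(String pipelineName); }

    /** The executor only executes; it hands out clients for its own jobs. */
    interface Executor { JobClient execute(String pipelineName); }

    /** The component from point iii), with the methods named in the mail. */
    interface ClientFactory {
        ClusterClient createCluster(Configuration configuration);
        JobClient retrieveJobClient(Configuration configuration, JobId jobId);
        Executor getExecutor(Configuration configuration);
    }

    /** Toy implementation that keeps "running" jobs in a map. */
    static final class InMemoryClientFactory implements ClientFactory {
        private final Map<JobId, JobClient> running = new ConcurrentHashMap<>();
        private final AtomicInteger counter = new AtomicInteger();

        @Override
        public ClusterClient createCluster(Configuration configuration) {
            return pipelineName -> {
                JobId id = new JobId(pipelineName + "-" + counter.incrementAndGet());
                JobClient client = () -> id;
                running.put(id, client);
                return client;
            };
        }

        @Override
        public JobClient retrieveJobClient(Configuration configuration, JobId jobId) {
            JobClient client = running.get(jobId);
            if (client == null) {
                throw new IllegalArgumentException("No running job with id " + jobId.value);
            }
            return client;
        }

        @Override
        public Executor getExecutor(Configuration configuration) {
            // The executor is a thin layer over the same underlying client.
            ClusterClient cluster = createCluster(configuration);
            return cluster::submit;
        }
    }
}
```

The point of the split is that retrieval of a client for an already
running job goes through the factory, while the Executor keeps the
single responsibility of executing a pipeline.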

Although under the hood all these abstractions (Environments,
Executors, ...) use the same clients, I believe their roles do not
conflict: they simply hide some of the complexity from the user and
give us, as developers, some freedom to change some of the parts in
the future. For example, the executor will take a Pipeline, create a
JobGraph and submit it, instead of requiring the user to do each step
separately. This allows us to, for example, get rid of the Plan if in
the future everything is DataStream. Essentially, I think of these as
layers of an onion with the clients being close to the core. The
higher you go, the more functionality is included and hidden from the
public eye.
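
As a rough illustration of the layering, the sketch below has an
outer Executor that hides the Pipeline-to-JobGraph translation and
delegates to an inner client that only knows about JobGraphs. The
types are simplified stand-ins that borrow Flink's names for
readability; the translator function and the string job handle are
assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in types only; they borrow Flink's names but are not its classes.
final class LayeringSketch {
    private LayeringSketch() {}

    /** What the user builds. */
    static final class Pipeline {
        final String name;
        Pipeline(String name) { this.name = name; }
    }

    /** What the cluster accepts. */
    static final class JobGraph {
        final String jobName;
        JobGraph(String jobName) { this.jobName = jobName; }
    }

    /** Inner layer: a low-level client that only deals with JobGraphs. */
    static final class ClusterClient {
        final List<String> submitted = new ArrayList<>();

        String submitJob(JobGraph graph) {
            submitted.add(graph.jobName);
            return "job-" + submitted.size();
        }
    }

    /** Outer layer: the user calls execute() and never sees the JobGraph. */
    static final class Executor {
        private final Function<Pipeline, JobGraph> translator;
        private final ClusterClient client;

        Executor(Function<Pipeline, JobGraph> translator, ClusterClient client) {
            this.translator = translator;
            this.client = client;
        }

        String execute(Pipeline pipeline) {
            return client.submitJob(translator.apply(pipeline));
        }
    }
}
```

Because the translation step is injected and hidden behind execute(),
it can be replaced later (for example when the Plan goes away) without
any change to user code.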

Point iii), by the way, is just a thought and by no means final. I also
like the idea of multi-layered clients, so this may spark further
discussion.

Cheers,
Kostas

On Wed, Sep 25, 2019 at 2:21 PM Aljoscha Krettek <aljos...@apache.org> wrote:
>
> Hi Tison,
>
> Thanks for proposing the document! I had some comments on the document.
>
> I think the only complex thing that we still need to figure out is how to get
> a JobClient for a job that is already running, as you mentioned in the
> document. Currently I’m thinking that it’s ok to add a method to Executor for
> retrieving a JobClient for a running job by providing an ID. Let’s see what
> Kostas has to say on the topic.
>
> Best,
> Aljoscha
>
> > On 25. Sep 2019, at 12:31, Zili Chen <wander4...@gmail.com> wrote:
> >
> > Hi all,
> >
> > As a summary of the discussion about introducing a Flink JobClient API[1],
> > we drafted FLIP-74[2] to gather thoughts and work towards a standard public,
> > user-facing interface.
> >
> > This discussion thread aims at standardizing the job-level client API. But
> > I'd like to emphasize that how to retrieve a JobClient may trigger further
> > discussion on the different levels of clients exposed by Flink, so a
> > follow-up thread will be started later to coordinate FLIP-73 and FLIP-74 on
> > the question of what to expose.
> >
> > Looking forward to your opinions.
> >
> > Best,
> > tison.
> >
> > [1]
> > https://lists.apache.org/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E
> > [2]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-74%3A+Flink+JobClient+API
>
