Re: [Discuss] Add JobListener (hook) in flink job lifecycle

Till Rohrmann Tue, 23 Apr 2019 01:24:19 -0700

I think we should not expose the ClusterClient configuration via the
ExecutionEnvironment (env.getClusterClient().addJobListener) because this
is effectively the same as exposing the JobListener interface directly on
the ExecutionEnvironment. Instead I think it could be possible to provide a
ClusterClient factory which is picked up from the Configuration or some
other mechanism for example. That way it would not need to be exposed via
the ExecutionEnvironment at all.


Cheers,
Till

On Fri, Apr 19, 2019 at 11:12 AM Jeff Zhang <zjf...@gmail.com> wrote:

> >>>  The ExecutionEnvironment is usually used by the user who writes the
> code and this person (I assume) would not be really interested in these
> callbacks.
>
> Usually ExecutionEnvironment is used by the user who write the code, but
> it doesn't needs to be created and configured by this person. e.g. in
> Zeppelin notebook, ExecutionEnvironment is created by Zeppelin, user just
> use ExecutionEnvironment to write flink program.  You are right that the
> end user would not be interested in these callback, but the third party
> library that integrate with zeppelin would be interested in these callbacks.
>
> >>> In your case, it could be sufficient to offer some hooks for the
> ClusterClient or being able to provide a custom ClusterClient.
>
> Actually in my initial PR (https://github.com/apache/flink/pull/8190), I
> do pass JobListener to ClusterClient and invoke it there.
> But IMHO, ClusterClient is not supposed be a public api for users. Instead
> JobClient is the public api that user should use to control job. So adding
> hooks to ClusterClient directly and provide a custom ClusterClient doesn't
> make sense to me. IIUC, you are suggesting the following approach
>      env.getClusterClient().addJobListener(jobListener)
> but I don't see its benefit compared to this.
>      env.addJobListener(jobListener)
>
> Overall, I think adding hooks is orthogonal with fine grained job
> control. And I agree that we should refactor the flink client component,
> but I don't think it would affect the JobListener interface. What do you
> think ?
>
>
>
>
> Till Rohrmann <trohrm...@apache.org> 于2019年4月18日周四 下午8:57写道：
>
>> Thanks for starting this discussion Jeff. I can see the need for
>> additional hooks for third party integrations.
>>
>> The thing I'm wondering is whether we really need/want to expose a
>> JobListener via the ExecutionEnvironment. The ExecutionEnvironment is
>> usually used by the user who writes the code and this person (I assume)
>> would not be really interested in these callbacks. If he would, then one
>> should rather think about a better programmatic job control where the
>> `ExecutionEnvironment#execute` call returns a `JobClient` instance.
>> Moreover, we would effectively make this part of the public API and every
>> implementation would need to offer it.
>>
>> In your case, it could be sufficient to offer some hooks for the
>> ClusterClient or being able to provide a custom ClusterClient. The
>> ClusterClient is the component responsible for the job submission and
>> retrieval of the job result and, hence, would be able to signal when a job
>> has been submitted or completed.
>>
>> Cheers,
>> Till
>>
>> On Thu, Apr 18, 2019 at 8:57 AM vino yang <yanghua1...@gmail.com> wrote:
>>
>>> Hi Jeff,
>>>
>>> I personally like this proposal. From the perspective of
>>> programmability, the JobListener can make the third program more
>>> appreciable.
>>>
>>> The scene where I need the listener is the Flink cube engine for Apache
>>> Kylin. In the case, the Flink job program is embedded into the Kylin's
>>> executable context.
>>>
>>> If we could have this listener, it would be easier to integrate with
>>> Kylin.
>>>
>>> Best,
>>> Vino
>>>
>>> Jeff Zhang <zjf...@gmail.com> 于2019年4月18日周四 下午1:30写道：
>>>
>>>>
>>>> Hi All,
>>>>
>>>> I created FLINK-12214
>>>> <https://issues.apache.org/jira/browse/FLINK-12214> for adding
>>>> JobListener (hook) in flink job lifecycle. Since this is a new public api
>>>> for flink, so I'd like to discuss it more widely in community to get more
>>>> feedback.
>>>>
>>>> The background and motivation is that I am integrating flink into apache
>>>> zeppelin <http://zeppelin.apache.org/>(which is a notebook in case you
>>>> don't know). And I'd like to capture some job context (like jobId) in the
>>>> lifecycle of flink job (submission, executed, cancelled) so that I can
>>>> manipulate job in more fined grained control (e.g. I can capture the jobId
>>>> when job is submitted, and then associate it with one paragraph, and when
>>>> user click the cancel button, I can call the flink cancel api to cancel
>>>> this job)
>>>>
>>>> I believe other projects which integrate flink would need similar
>>>> mechanism. I plan to add api addJobListener in
>>>> ExecutionEnvironment/StreamExecutionEnvironment so that user can add
>>>> customized hook in flink job lifecycle.
>>>>
>>>> Here's draft interface JobListener.
>>>>
>>>> public interface JobListener {
>>>>
>>>> void onJobSubmitted(JobID jobId);
>>>>
>>>> void onJobExecuted(JobExecutionResult jobResult);
>>>>
>>>> void onJobCanceled(JobID jobId, String savepointPath);
>>>> }
>>>>
>>>> Let me know your comment and concern, thanks.
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>
>
> --
> Best Regards
>
> Jeff Zhang
>

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

Reply via email to