Interesting document, two questions:

1. Why is JobService runner-specific? Couldn't at least a good part of it be reused, given that the runner-specific parts are mostly in the translation? Or am I missing other reasons?
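To make that first question concrete, this is the kind of split I have in mind: a runner-agnostic JobService core that owns job bookkeeping and lifecycle, delegating only pipeline translation and submission to a runner-specific plugin. The names below are made up purely for illustration and are not actual Beam classes:

from abc import ABC, abstractmethod

# Hypothetical sketch only; not real Beam code. It illustrates the split
# behind question 1: a shared core plus a per-runner translation/submission
# plugin.

class RunnerPlugin(ABC):
    """The runner-specific part: translate and submit a portable pipeline."""

    @abstractmethod
    def translate(self, pipeline_proto):
        """Translate the portable pipeline proto into a runner-native job."""

    @abstractmethod
    def submit(self, native_job):
        """Submit the translated job and return a runner-level job handle."""

class SharedJobService:
    """The runner-agnostic part: run/cancel and job state tracking."""

    def __init__(self, plugin):
        self._plugin = plugin
        self._jobs = {}  # job_id -> runner-level job handle

    def run(self, job_id, pipeline_proto):
        native_job = self._plugin.translate(pipeline_proto)
        self._jobs[job_id] = self._plugin.submit(native_job)
        return job_id

    def cancel(self, job_id):
        # Assumes the runner-level handle exposes a cancel operation.
        self._jobs[job_id].cancel()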
2. What about authentication and authorisation for production runners? Once such a service can be used to submit/cancel pipelines, that is the first thing I can think of being abused.

On Tue, May 22, 2018 at 9:40 PM Ankur Goenka <[email protected]> wrote:

> Thank you guys for the input. Here is the summary.
>
> Responsibility of Beam on Job Management
> Beam provides a common interface for basic job management operations called JobService. The supported operations can vary between runners.
>
> What is JobService?
> JobService is a runner-specific component which implements Beam's JobService interface defined here.
>
> What is the life cycle of a JobService?
> There are 3 scenarios:
> With the ULR (Universal Local Runner), JobService is short lived and runs as long as the ULR runs (JobService lifespan ~= job lifespan).
> With production runners (Flink, Dataflow, etc.), JobService can be either short lived or long lived. The choice is up to the runner.
> With production runners (Flink, Dataflow, etc.) without a long-running JobService, the SDK will spin up a local JobService.
>
> JobService state management
> The choice of state management is up to the JobService implementation. The basic requirement is that JobService should be able to perform all the operations with the returned job handle.
> At the very least, the handle can be the job handle for the underlying runner job, and JobService will simply proxy actions to the runner using the provided job handle.
> A persistent JobService is free to provide a simple string as a job handle. In this case, the job handle can only be used with the same JobService.
> A stateless, non-persistent JobService can provide an opaque blob containing all the relevant information about the job. In this case the job handle can be used with any instance of JobService running the same code.
>
> JobService code distribution and invocation when JobService is short lived
> We will provide an easy-to-run solution using Docker. Docker will help with both executable distribution and providing a platform-independent binary.
> We will also provide an easy setup script with a supporting document for users who do not want to use Docker on their local machine.
>
> Should the Flink JobService start a local cluster for testing?
> The Flink JobService will be capable of submitting to a remote Flink cluster if a master URL is provided; otherwise it will execute the pipeline in an in-process Flink invocation in the same JVM.
>
> On Tue, May 22, 2018 at 12:37 PM Eugene Kirpichov <[email protected]> wrote:
>
>> Thanks Ankur, I think there's consensus, so it's probably ready to share :)
>>
>> On Fri, May 18, 2018 at 3:00 PM Ankur Goenka <[email protected]> wrote:
>>
>>> Thanks for all the input.
>>> I have summarized the discussions at the bottom of the document (here).
>>> Please feel free to provide comments. Once we agree, I will publish the conclusion on the mailing list.
>>>
>>> On Mon, May 14, 2018 at 1:51 PM Eugene Kirpichov <[email protected]> wrote:
>>>
>>>> Thanks Ankur, this document clarifies a few points and raises some very important questions. I encourage everybody with a stake in Portability to take a look and chime in.
>>>> +Aljoscha Krettek +Thomas Weise +Henning Rohde
>>>>
>>>> On Mon, May 14, 2018 at 12:34 PM Ankur Goenka <[email protected]> wrote:
>>>>
>>>>> Updated link to the document as the previous link was not working for some people.
>>>>>
>>>>> On Fri, May 11, 2018 at 7:56 PM Ankur Goenka <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>> The recent effort on portability has introduced a JobService and an ArtifactService to the Beam stack alongside the SDKs. This has opened up a few questions around how we start a pipeline in a portable setup (with a JobService).
>>>>>> I am trying to document our approach to launching a portable pipeline and make binding decisions based on the discussion.
>>>>>> Please review the document and provide your feedback.
>>>>>> Thanks,
>>>>>> Ankur
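As a concrete reference for the JobService interaction summarized above, submitting a portable pipeline from the Python SDK to a long-running JobService would look roughly like the sketch below. The job endpoint address is only an example, not a fixed default:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Point the SDK at a long-running JobService instead of a runner-specific
# submission path. The endpoint address below is illustrative.
options = PipelineOptions([
    '--runner=PortableRunner',
    '--job_endpoint=localhost:8099',
])

with beam.Pipeline(options=options) as p:
    (p
     | beam.Create(['hello', 'portable', 'beam'])
     | beam.Map(lambda word: word.upper()))

And on the job-handle discussion: my reading of the "opaque blob" option for a stateless JobService is something like the following (purely illustrative, not a proposed format):

import base64
import json

# Illustrative only: a stateless JobService could pack everything needed to
# act on a job into a self-contained, opaque handle, so any instance running
# the same code can cancel or query it later.
def make_job_handle(runner, master_url, runner_job_id):
    payload = {'runner': runner, 'master_url': master_url, 'job_id': runner_job_id}
    return base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()

def parse_job_handle(handle):
    return json.loads(base64.urlsafe_b64decode(handle.encode()).decode())

handle = make_job_handle('flink', 'flink-master:8081', 'a1b2c3')
print(parse_job_handle(handle))

The authentication/authorisation question above applies to whichever handle format is chosen, since the handle alone is enough to cancel a job.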
