Re: Tracking what works with portability

Henning Rohde Fri, 11 May 2018 12:47:54 -0700

> For runners*SDK pairs that don't have a batch/streaming distinction how
> about collapsing the columns?


There is also often a difference in whether we've actually tried them or
whether there are regression tests. Once we have a clearer (= greener and
bluer) picture, I'm fine with collapsing some columns. But, for now, I'd
like to see how it plays out.

Henning


On Fri, May 11, 2018 at 12:16 PM Henning Rohde <[email protected]> wrote:

> > Yea so I guess the column is more just "what works?" and not "what
> > works with portability?"
>
> Yeah - the Direct runner column is just "what works". It's included,
> because direct runners are still relevant in the portable world and it's
> useful to see what is supported there in comparison with the portable
> runners. I clarified the caption.
>
> Henning
>
> On Fri, May 11, 2018 at 12:12 PM Kenneth Knowles <[email protected]> wrote:
>
>> On Fri, May 11, 2018 at 11:46 AM Lukasz Cwik <[email protected]> wrote:
>>
>>>
>>> On Fri, May 11, 2018 at 11:40 AM Kenneth Knowles <[email protected]> wrote:
>>>
>>>> This is great. "The Beam Vision in a spreadsheet" and/or what the
>>>> capability matrix wishes it always had been.
>>>>
>>>>  - I don't know how to interpret the DirectRunner column. Is it that it
>>>> uses ye olde proto round trip? Another level is that it actually directly
>>>> links in the SDK harness as a dep and uses the exact code paths (seems like
>>>> overkill).
>>>>
>>>>
>>> It's up to the direct runner here to decide what level of execution is
>>> actually done via portability APIs but it is meant to be a single process
>>> to ease debugging for users.
>>>
>>
>> Yea so I guess the column is more just "what works?" and not "what works
>> with portability?" in this case. Just a clarification - either way is fine
>> by me. I wasn't sure if the column was to track progress on making the
>> direct runners respect the model or whatnot. Without a proto round trip, a
>> DirectRunner can easily have non-model behaviors by using information that
>> it shouldn't.
>>
>>>>  - For runners*SDK pairs that don't have a batch/streaming distinction
>>>> how about collapsing the columns?
>>>>
>>>>
>>> Runners may not have a distinction but the portability framework may
>>> require more work from a runner to support a use case. A good example of
>>> this is side input readiness checking for streaming pipelines.
>>>
>>
>> What do you mean by the portability framework? Do you mean an SDK harness?
>> Or that the protos do not express enough information?
>>
>> Kenn
>>
>>
>>>>  - Anyone have spreadsheet-fu to do a permanent global automatic
>>>> hyperlinking of BEAM-xxxx?
>>>>
>>>> Kenn
>>>>
>>>> On Fri, May 11, 2018 at 10:38 AM Henning Rohde <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>>  While the portability framework moves forward, it is often hard to
>>>>> figure out exactly what is supported to work at any given time. There
>>>>> are still many irregularities, TODOs, bugs and small differences between
>>>>> batch and streaming and the portable SDK and runner implementations.
>>>>> For example, the answer to the question "Does Wordcount run
>>>>> portably?" depends on the SDK, Runner and where the output is written.
>>>>>
>>>>> To this end, I've started a spreadsheet to better track the "swiss
>>>>> cheese" of what works portably:
>>>>>
>>>>>
>>>>> https://docs.google.com/spreadsheets/d/1KDa_FGn1ShjomGd-UUDOhuh2q73de2tPz6BqHpzqvNI/edit?usp=sharing
>>>>>
>>>>> Note that this is a work in progress. The intended audience is
>>>>> everyone working on or interested in portability. I am hoping we can
>>>>> populate, expand and maintain the information as a community, until the
>>>>> portability framework support is mature enough to allow SDKs and runners
>>>>> to be considered independently.
>>>>>
>>>>> Comments and suggestions welcome!
>>>>>
>>>>> Thanks,
>>>>>  Henning
>>>>>
>>>>>
>>>>>
>>>>>
