Re: Custom URNs and runner translation

2018-04-27 Thread Robert Bradshaw
On Fri, Apr 27, 2018 at 12:34 PM Kenneth Knowles wrote: > On Fri, Apr 27, 2018 at 12:18 PM Thomas Weise wrote: >> The ability to specify with URN and implement custom transforms is also important. Such transforms may not qualify for inclusion in Beam for a variety of reasons (only relevant for

Re: Custom URNs and runner translation

2018-04-27 Thread Kenneth Knowles
On Fri, Apr 27, 2018 at 12:18 PM Thomas Weise wrote: > > The ability to specify with URN and implement custom transforms is also > important. Such transforms may not qualify for inclusion in Beam for a > variety of reasons (only relevant for a specific environment or use case, > dependencies/lice

Re: Custom URNs and runner translation

2018-04-27 Thread Robert Bradshaw
On Fri, Apr 27, 2018 at 12:18 PM Thomas Weise wrote: > Thanks for all the feedback! I agree that the desirable state is to have solid connector implementations for all common integration scenarios as part of Beam. And it seems that the path there would be cross-language IO. > The ability to spec

Re: Custom URNs and runner translation

2018-04-27 Thread Thomas Weise
Thanks for all the feedback! I agree that the desirable state is to have solid connector implementations for all common integration scenarios as part of Beam. And it seems that the path there would be cross-language IO. The ability to specify with URN and implement custom transforms is also import

Re: Custom URNs and runner translation

2018-04-27 Thread Lukasz Cwik
On Thu, Apr 26, 2018 at 8:38 PM Chamikara Jayalath wrote: > > > On Thu, Apr 26, 2018 at 5:59 PM Eugene Kirpichov > wrote: > >> I agree with Thomas' sentiment that cross-language IO is very important >> because of how much work it takes to produce a mature connector >> implementation in a languag

Re: Custom URNs and runner translation

2018-04-26 Thread Chamikara Jayalath
On Thu, Apr 26, 2018 at 5:59 PM Eugene Kirpichov wrote: > I agree with Thomas' sentiment that cross-language IO is very important > because of how much work it takes to produce a mature connector > implementation in a language. Looking at implementations of BigQueryIO, > PubSubIO, KafkaIO, FileIO

Re: Custom URNs and runner translation

2018-04-26 Thread Eugene Kirpichov
I agree with Thomas' sentiment that cross-language IO is very important because of how much work it takes to produce a mature connector implementation in a language. Looking at implementations of BigQueryIO, PubSubIO, KafkaIO, FileIO in Java, only a very daring soul would be tempted to reimplement

Re: Custom URNs and runner translation

2018-04-25 Thread Kenneth Knowles
It doesn't have to be 1:1 swapping KafkaIO for a Flink Kafka connector, right? I was imagining: Python SDK submits pipeline with a KafkaIO (with URN + payload) maybe bogus contents. It is replaced with a small Flink subgraph, including the native Flink Kafka connector and some compensating transfom

Re: Custom URNs and runner translation

2018-04-25 Thread Chamikara Jayalath
On Wed, Apr 25, 2018 at 6:57 PM Reuven Lax wrote: > On Wed, Apr 25, 2018 at 6:51 PM Kenneth Knowles wrote: > >> The premise of URN + payload is that you can establish a spec. A native >> override still needs to meet the spec - it may still require some >> compensating code. Worrying about weird

Re: Custom URNs and runner translation

2018-04-25 Thread Reuven Lax
On Wed, Apr 25, 2018 at 6:51 PM Kenneth Knowles wrote: > The premise of URN + payload is that you can establish a spec. A native > override still needs to meet the spec - it may still require some > compensating code. Worrying about weird differences between runners seems > more about worrying th

Re: Custom URNs and runner translation

2018-04-25 Thread Kenneth Knowles
The premise of URN + payload is that you can establish a spec. A native override still needs to meet the spec - it may still require some compensating code. Worrying about weird differences between runners seems more about worrying that an adequate spec cannot be determined. Runners will already i

Re: Custom URNs and runner translation

2018-04-25 Thread Reuven Lax
On Tue, Apr 24, 2018 at 5:52 PM Chamikara Jayalath wrote: > > > On Tue, Apr 24, 2018 at 3:44 PM Henning Rohde wrote: > >> > Note that a KafkaDoFn still needs to be provided, but could be a DoFn >> that >> > fails loudly if it's actually called in the short term rather than a >> full >> > Python

Re: Custom URNs and runner translation

2018-04-24 Thread Chamikara Jayalath
On Tue, Apr 24, 2018 at 3:44 PM Henning Rohde wrote: > > Note that a KafkaDoFn still needs to be provided, but could be a DoFn > that > > fails loudly if it's actually called in the short term rather than a full > > Python implementation. > > For configurable runner-native IO, for now, I think it

Re: Custom URNs and runner translation

2018-04-24 Thread Henning Rohde
> Note that a KafkaDoFn still needs to be provided, but could be a DoFn that > fails loudly if it's actually called in the short term rather than a full > Python implementation. For configurable runner-native IO, for now, I think it is reasonable to use a URN + special data payload directly withou

Re: Custom URNs and runner translation

2018-04-24 Thread Robert Bradshaw
On Tue, Apr 24, 2018 at 1:14 PM Thomas Weise wrote: > Hi Cham, > Thanks for the feedback! > I should have probably clarified that my POC and questions aren't specific to Kafka as source, but pretty much any other source/sink that we internally use as well. We have existing Flink pipelines that

Re: Custom URNs and runner translation

2018-04-24 Thread Thomas Weise
Hi Cham, Thanks for the feedback! I should have probably clarified that my POC and questions aren't specific to Kafka as source, but pretty much any other source/sink that we internally use as well. We have existing Flink pipelines that are written in Java and we want to use the same connectors w

Re: Custom URNs and runner translation

2018-04-24 Thread Chamikara Jayalath
Hi Thomas, Seems like we are working on similar (partially) things :). On Tue, Apr 24, 2018 at 9:03 AM Thomas Weise wrote: > I'm working on a mini POC to enable Kafka as custom streaming source for a > Python pipeline executing on the (in-progress) portable Flink runner. > > We eventually want

Custom URNs and runner translation

2018-04-24 Thread Thomas Weise
I'm working on a mini POC to enable Kafka as custom streaming source for a Python pipeline executing on the (in-progress) portable Flink runner. We eventually want to use the same native Flink connectors for sources and sinks that we also use in other Flink jobs. I got a simple example to work wi