Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Kenneth Knowles
Yea, I think it could. But it is probably more readable to not overload the term, plus certainly a bit simpler in implementation. So perhaps @AsyncElement to make it very clear. Kenn On Sat, Mar 10, 2018 at 1:32 PM Reuven Lax wrote: > Ken, can NewDoFn distinguish at generation time the differen

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Reuven Lax
Ken, can NewDoFn distinguish at generation time the difference between: public void process(@Element CompletionStage element, ...) { and public void process(@Element Input element, ...) { If not, then we would probably need separate annotations On Sat, Mar 10, 2018 at 11:09 AM

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Romain Manni-Bucau
Le 10 mars 2018 20:09, "Kenneth Knowles" a écrit : Nice! I agree that providing a CompletionStage for chaining is much better than an ExecutorService, and very clear. It is very feasible to add support that looks like new DoFn() { @ProcessElement public void process(@Element Completio

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Kenneth Knowles
Nice! I agree that providing a CompletionStage for chaining is much better than an ExecutorService, and very clear. It is very feasible to add support that looks like new DoFn() { @ProcessElement public void process(@Element CompletionStage element, ...) { element.thenApply(...)

Re: The Go SDK got accidentally merged - options to deal with the pain

2018-03-10 Thread Henning Rohde
Thank you all! I've added the remaining work -- as I understand it -- as dependencies to the overall Go SDK issue (tracking the "official" merge to master): https://issues.apache.org/jira/browse/BEAM-2083 Please feel free to add to this list or expand the items, if there is anything I overloo

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Romain Manni-Bucau
2018-03-10 17:30 GMT+01:00 Reuven Lax : > Have you considered drafting in detail what you think this API might look > like? > Yes, but it is after the "enhancements" - for my use cases - and "bugs" list so didn't started to work on it much. > > If it's a radically different API, it might be mo

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Reuven Lax
Have you considered drafting in detail what you think this API might look like? If it's a radically different API, it might be more appropriate as an alternative parallel Beam API rather than a replacement for the current API (there is also one such fluent API in the works). On Sat, Mar 10, 2018

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Romain Manni-Bucau
2018-03-10 16:19 GMT+01:00 Reuven Lax : > This is another version (maybe a better, Java 8 idiomatic one?) of what > Kenn suggested. > > Note that with NewDoFn this need not be incompatible (so might not require > waiting till Beam 3.0). We can recognize new parameters to processElement > and popul

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Reuven Lax
This is another version (maybe a better, Java 8 idiomatic one?) of what Kenn suggested. Note that with NewDoFn this need not be incompatible (so might not require waiting till Beam 3.0). We can recognize new parameters to processElement and populate add needed. On Sat, Mar 10, 2018, 12:13 PM Roma

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Romain Manni-Bucau
Yes, for the dofn for instance, instead of having processcontext.element()= you get a CompletionStage and output gets it as well. This way you register an execution chain. Mixed with streams you get a big data java 8/9/10 API which enabkes any connectivity in a wel performing manner ;). Le 10 mar

Re: Advice on parallelizing network calls in DoFn

2018-03-10 Thread Reuven Lax
So you mean the user should have a way of registering asynchronous activity with a callback (the callback must be registered with Beam, because Beam needs to know not to mark the element as done until all associated callbacks have completed). I think that's basically what Kenn was suggesting, unles

Re: [PROPOSITION] schedule some sanity tests on a daily basis

2018-03-10 Thread Łukasz Gajowy
> > - Integration tests: AFAIK we only run the ones in examples module and > only on demand. What about running all the IT (in > particular IO IT) as a cron job on a daily basis with direct runner? > Please note that it will require some always up > backend infrastructure. > Running IOITs on Direc