Jenkins build is back to normal : beam_Release_NightlySnapshot #323

2017-02-08 Thread Apache Jenkins Server
See

Re: How does SideInputHandler work?

2017-02-08 Thread Kenneth Knowles
Hi Shen, Yes, this is how some existing runners do it. Here is one example: https://github.com/apache/beam/blob/master/runners/apex/src/main/java/org/apache/beam/runners/apex/ApexRunner.java#L319 Kenn On Tue, Feb 7, 2017 at 3:33 PM, Shen Li wrote: > Hi Kenn, > > Thanks for explaining. > > What

Re: Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #2578

2017-02-08 Thread Kenneth Knowles
This cleared up; Jenkins worker transient failure? On Wed, Feb 8, 2017 at 4:57 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See MavenInstall/2578/changes> > > Changes: > > [tgroh] Add Input Reconstruction to PTransformOverr

Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Dataflow #2241

2017-02-08 Thread Kenneth Knowles
This is on a PR, not postcommit. On Wed, Feb 8, 2017 at 7:35 PM, Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See RunnableOnService_Dataflow/2241/> > >

Re: [RESULT] [VOTE] Apache Beam, version 0.5.0, release candidate #2

2017-02-08 Thread Davor Bonaci
This release is now complete. Thanks to everyone who has helped make this release possible! On Mon, Feb 6, 2017 at 8:36 AM, Davor Bonaci wrote: > I'm happy to announce that we have unanimously approved this release. > > There are 5 approving votes, 4 of which are binding: > * Davor Bonaci > *

Re: Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Dataflow #2238

2017-02-08 Thread Eugene Kirpichov
Please never mind this build. This is a one-shot I triggered manually for https://github.com/apache/beam/pull/1898. On Wed, Feb 8, 2017 at 2:55 PM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See < > https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Dataflow/2238/

Re: Should you always have a separate PTransform class for a new transform?

2017-02-08 Thread Eugene Kirpichov
I think the value in having Mean.perKey() in addition to Mean.combineFn() is that using Mean.perKey() does not require knowledge of the combine concept, so easier for users. Generally, when using Beam, to simply compute a count or a mean, you should not need to know about combine. On Wed, Feb 8, 2
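To make the point about Mean.perKey() versus Mean.combineFn() concrete, here is a minimal sketch of the two entry points side by side. It does not use the Beam SDK: SimpleCombineFn, MeanSketch and the Map-based perKey are hypothetical stand-ins invented for illustration, and Beam's real CombineFn and PTransform types look different. The idea shown is only that the convenience method hides the combine concept from the caller while reusing the same combine function underneath.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MeanSketch {

  /** Hypothetical, simplified combine function; NOT Beam's CombineFn. */
  interface SimpleCombineFn<InT, AccT, OutT> {
    AccT createAccumulator();
    AccT addInput(AccT acc, InT input);
    OutT extractOutput(AccT acc);
  }

  /** Static factory for users who know about combining and want to compose it. */
  static SimpleCombineFn<Double, double[], Double> combineFn() {
    return new SimpleCombineFn<Double, double[], Double>() {
      @Override
      public double[] createAccumulator() {
        return new double[] {0.0, 0.0}; // {running sum, element count}
      }

      @Override
      public double[] addInput(double[] acc, Double input) {
        acc[0] += input;
        acc[1] += 1;
        return acc;
      }

      @Override
      public Double extractOutput(double[] acc) {
        return acc[1] == 0 ? Double.NaN : acc[0] / acc[1];
      }
    };
  }

  /** Convenience entry point: the caller never sees the combine function. */
  static <K> Map<K, Double> perKey(Map<K, List<Double>> grouped) {
    SimpleCombineFn<Double, double[], Double> fn = combineFn();
    Map<K, Double> out = new LinkedHashMap<>();
    for (Map.Entry<K, List<Double>> entry : grouped.entrySet()) {
      double[] acc = fn.createAccumulator();
      for (Double value : entry.getValue()) {
        acc = fn.addInput(acc, value);
      }
      out.put(entry.getKey(), fn.extractOutput(acc));
    }
    return out;
  }
}
```

A caller who only wants a mean per key calls MeanSketch.perKey(...); a caller who wants to plug the same logic into another aggregation reaches for MeanSketch.combineFn().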

Re: Should you always have a separate PTransform class for a new transform?

2017-02-08 Thread Robert Bradshaw
On Wed, Feb 8, 2017 at 1:27 PM, Eugene Kirpichov < kirpic...@google.com.invalid> wrote: > So... Would it be fair to say that everybody would be satisfied if we > treated the "glorified combine" transforms (Sum, Count, Mean, Sample, > Latest) the following way: > - For each case, SDK must expose th

Re: Should you always have a separate PTransform class for a new transform?

2017-02-08 Thread Eugene Kirpichov
So... Would it be fair to say that everybody would be satisfied if we treated the "glorified combine" transforms (Sum, Count, Mean, Sample, Latest) the following way: - For each case, SDK must expose the relevant CombineFn as a static factory function: e.g. Sum.ofIntegers(), Latest.of(), etc. [it m

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Xu Mingmin
checking the interface of 0.8.2, 0.9.0, 0.10.0. Eventually the type is unified as Collection<> from mixing vararg, List<>, Collection<>, On Wed, Feb 8, 2017 at 12:33 PM, Raghu Angadi wrote: > On Wed, Feb 8, 2017 at 12:19 PM, Jesse Anderson > wrote: > > > I'm not. There was a decent amount of ti

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Raghu Angadi
On Wed, Feb 8, 2017 at 12:19 PM, Jesse Anderson wrote: > I'm not. There was a decent amount of time between the first 0.8 and 0.9 > release. > The ones that affect us are minor changes between 0.9 and 0.10 (e.g. changing vararg to Collection<>). Maybe both could have existed with the older one marked de

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Jesse Anderson
I'm not. There was a decent amount of time between the first 0.8 and 0.9 release. On Wed, Feb 8, 2017, 12:08 PM Raghu Angadi wrote: > True. > > I was commenting on Kafka developers. I am surprised the api breakages > didn't have any deprecation period at all. > > On Wed, Feb 8, 2017 at 12:02 PM,

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Raghu Angadi
True. I was commenting on Kafka developers. I am surprised the api breakages didn't have any deprecation period at all. On Wed, Feb 8, 2017 at 12:02 PM, Xu Mingmin wrote: > i tend to have more versions supported, actually in our prod environment, > there're 0.8, 0.9 and 0.10 for different team

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Xu Mingmin
i tend to have more versions supported, actually in our prod environment, there're 0.8, 0.9 and 0.10 for different teams. we'd take care of users who are on old versions. On Wed, Feb 8, 2017 at 10:56 AM, Raghu Angadi wrote: > If we let the user pick their kafka version in their dependencies, s

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Raghu Angadi
If we let the user pick their kafka version in their dependencies, simplest fix is to broaden KafkaIO kafka-client dependency to something like [0.9.1, 0.11) (and handle the api incompatibility at runtime). It might not be long before we could drop 0.9 support. Looking at these api changes in Kafk

Re: Report to the Board, February 2017 edition

2017-02-08 Thread Davor Bonaci
Thanks everyone -- the report has been posted. On Tue, Feb 7, 2017 at 4:05 AM, Jean-Baptiste Onofré wrote: > Hi > > It looks good to me. > > Thanks Davor > Regards > JB > > On Feb 6, 2017, 19:32, at 19:32, Davor Bonaci wrote: > >We are expected to submit a project report to the ASF Board of > >

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-08 Thread Kenneth Knowles
I recommend proceeding with the runner-facing state & timer APIs; they are lower-level and more appropriate for this. All runners provide them or use runners/core implementations, as they are needed for triggering. On Wed, Feb 8, 2017 at 10:34 AM, Eugene Kirpichov wrote: > Thanks Aljoscha! > > M

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-02-08 Thread Kenneth Knowles
Hi Etienne, If the timer is firing n times for n elements, that's a bug in the runner / shared runner code. It should be deduped. Which runner? Can you file a JIRA against me to investigate? I'm still in the process of fleshing out more and more RunnableOnService (aka ValidatesRunner) tests so I w

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-08 Thread Eugene Kirpichov
Thanks Aljoscha! Minor note: I'm not familiar with what level of support for timers Flink currently has - however SDF in Direct and Dataflow runner currently does not use the user-facing state/timer APIs - rather, it uses the runner-facing APIs (StateInternals and TimerInternals) - perhaps Flink a

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Jean-Baptiste Onofré
We can always have kafka-common gathering code shared by kafka-9 and kafka-10. Then each module can have different dependencies, release cycle, etc. Regards JB On Feb 8, 2017, 14:09, at 14:09, Stephen Sisk wrote: >hi JB! > >Can you explain what you mean by easier to maintain? Is that because of

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Raghu Angadi
What is the recommended way for users to bundle their app? The fix could be as simple as letting the user set the version in a mvn property ('kafka.client.version'). The API incompatibility blocking Xu is small enough (just the seekToEnd() call) that it can be handled at runtime with reflection. It is called on
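As a rough illustration of handling the seekToEnd() difference at runtime with reflection, the sketch below looks up whichever signature the consumer class offers and calls it. It is self-contained and does not depend on Kafka: VarargConsumer and CollectionConsumer are hypothetical stand-ins for the vararg-style and Collection-style signatures discussed in this thread, and partitions are plain strings rather than Kafka's TopicPartition. It is not the code that went into KafkaIO.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class SeekToEndCompat {

  /** Hypothetical stand-in for the older, vararg-style consumer API. */
  public static class VarargConsumer {
    public final List<String> seeked = new ArrayList<>();

    public void seekToEnd(String... partitions) {
      seeked.addAll(Arrays.asList(partitions));
    }
  }

  /** Hypothetical stand-in for the newer, Collection-style consumer API. */
  public static class CollectionConsumer {
    public final List<String> seeked = new ArrayList<>();

    public void seekToEnd(Collection<String> partitions) {
      seeked.addAll(partitions);
    }
  }

  /**
   * Calls seekToEnd on the consumer using whichever signature exists.
   * Returns the name of the variant that was used, for demonstration only.
   */
  static String seekToEnd(Object consumer, Collection<String> partitions) {
    try {
      try {
        // Prefer the Collection-based signature when the class has it.
        Method m = consumer.getClass().getMethod("seekToEnd", Collection.class);
        m.invoke(consumer, partitions);
        return "collection";
      } catch (NoSuchMethodException e) {
        // Fall back to the vararg signature, which reflection sees as an array parameter.
        Method m = consumer.getClass().getMethod("seekToEnd", String[].class);
        // Cast to Object so the array is passed as one argument, not spread.
        m.invoke(consumer, (Object) partitions.toArray(new String[0]));
        return "vararg";
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("No usable seekToEnd method found", e);
    }
  }
}
```

In a real reader the Method lookup would normally be done once and cached, since the call sits on a per-consumer code path; the lookup is repeated here only to keep the sketch short.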

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Xu Mingmin
I have a design proposal doc here https://docs.google.com/document/d/1YlCWws4SYCqUWAtVz9mrmVdFNM8b3E8DLCImYep-I6k/edit# which tries to share the same KafkaIO code. Due to the API change of Kafka Consumer.class, it's nearly impossible to have a single module for both Kafka 0.9 and 0.10, unless reflec

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-08 Thread Aljoscha Krettek
Thanks for the motivation, Eugene! :-) I've wanted to do this for a while now but was waiting for the Flink 1.2 release (which happened this week)! There's some prerequisite work to be done on the Flink runner: we'll move to the new timer interfaces introduced in Flink 1.2 and implement support fo

Re: BEAM-307(KafkaIO on Kafka 0.10)

2017-02-08 Thread Stephen Sisk
hi JB! Can you explain what you mean by easier to maintain? Is that because of maven dependency management, or some other factor? If we need to go with 2 modules for dependency reasons, is there a way to share code between those two modules so that we don't have 2 copies of the code? (or is the co

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-08 Thread Eugene Kirpichov
Thanks! Looking forward to this work. On Wed, Feb 8, 2017 at 3:50 AM Jean-Baptiste Onofré wrote: > Thanks for the update Eugene. > > I will work on the spark runner with Amit. > > Regards > JB > > On Feb 7, 2017, 19:12, at 19:12, Eugene Kirpichov > wrote: > >Hello, > > > >I'm almost done adding

Re: [DISCUSS] Beam data plane serialization tech

2017-02-08 Thread vikas rk
+1. You mean sharing definitions between the fn API and runner API? Like the idea. -Vikas On 7 February 2017 at 14:39, Kenneth Knowles wrote: > This has lain dormant as I was drawn off to other things. But now I'm > looping back on this so there are no surprises in my upcoming (third) > revisi

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-02-08 Thread Jean-Baptiste Onofré
Hi AFAIR the timer per function is in the "roadmap" (remembering discussion we had with Kenn). I will take a deeper look next week on your branch. Regards JB On Feb 8, 2017, 13:28, at 13:28, Etienne Chauchot wrote: >Hi Kenn, > >I have started using state and timer APIs, they seem awesome! > >

Re: [BEAM-135] Utilities for "batching" elements in a DoFn

2017-02-08 Thread Etienne Chauchot
Hi Kenn, I have started using state and timer APIs, they seem awesome! Please take a look at https://github.com/echauchot/beam/tree/BEAM-135-BATCHING-PARDO It contains a PTransform that does the batching trans-bundles and respecting the windows (even if tests are not finished yet, see @Ignor

Re: Let's make Beam transforms comply with PTransform Style Guide

2017-02-08 Thread Jean-Baptiste Onofré
Thanks Eugene. I will tackle some Jira when back next week. Regards JB On Feb 7, 2017, 18:16, at 18:16, Eugene Kirpichov wrote: >Hey all, > >I bit the bullet and audited all PTransform classes in Beam Java SDK >and >filed JIRA issues for all violations I could find. >I linked all them to the m

Re: Call for help: let's add Splittable DoFn to Spark, Flink and Apex runners

2017-02-08 Thread Jean-Baptiste Onofré
Thanks for the update Eugene. I will work on the spark runner with Amit. Regards JB On Feb 7, 2017, 19:12, at 19:12, Eugene Kirpichov wrote: >Hello, > >I'm almost done adding support for Splittable DoFn >http://s.apache.org/splittable-do-fn to Dataflow streaming runner*, and >very excited abou

Fwd: Build failed in Jenkins: beam_PostCommit_Java_RunnableOnService_Spark #882

2017-02-08 Thread Amit Sela
Looks like PostCommit builds are broken for some runners, I checked out Spark, Flink and Apex. This started after this commit, and causes a "NoClassDefFoundError com/fasterxml/jackson/dataformat/yaml/YAMLFactory". Op