Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Pei HE
+1 On Sat, Apr 22, 2017 at 12:16 PM, Jean-Baptiste Onofré wrote: > +1 > > Regards > JB > > > On 04/21/2017 07:24 PM, Kenneth Knowles wrote: > >> Hi all, >> >> I propose that we remove KeyedCombineFn before the first stable release. >> >> I don't think it adds enough value for the complexity it a

Hanging Jenkins builds.

2017-04-21 Thread Aviem Zur
Hi all, Please be aware that Beam builds (precommit + postcommit validations) have been hanging for the past few hours. This seems to be a problem in builds of other projects as well (for example, Kafka). I've opened an INFRA ticket: https://issues.apache.org/jira/browse/INFRA-13949

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Jean-Baptiste Onofré
+1 Regards JB On 04/21/2017 07:24 PM, Kenneth Knowles wrote: Hi all, I propose that we remove KeyedCombineFn before the first stable release. I don't think it adds enough value for the complexity it adds to e.g. CombineWithContext [1] and state [2, 3], and it doesn't seem to me that users rea

Help needed with WordCount commands

2017-04-21 Thread Hadar Hod
Hi everyone! Your help is needed with the WordCount [1] documentation! I'm a technical writer, continuing to work on Beam docs. In getting the site ready for the 2.0 release, I added a “How to run” section to each of the examples in the WordCount doc (staged here [2]). To be more exact, I added a

Re: Towards a spec for robust streaming SQL, Part 1

2017-04-21 Thread Tyler Akidau
Good point, when you start talking about anything less than a full join, triggers get involved to describe how one actually achieves the desired semantics, and they may end up being tied to just one of the inputs (e.g., you may only care about the watermark for one side of the join). Am expecting u

Re: AfterWatermarkEarlyAndLate

2017-04-21 Thread Kenneth Knowles
All of your plans sound good. Private constructors, yes. Just one Trigger subclass, yes. I don't have strong feelings about the naming here. It could be just AfterWatermark or AfterWatermark.PastEndOfWindow. The conversion from Trigger to TriggerStateMachine goes through beam_runner_api.proto whi

Re: Towards a spec for robust streaming SQL, Part 1

2017-04-21 Thread Kenneth Knowles
There's something to be said about having different triggering depending on which side of a join data comes from, perhaps? (delightful doc, as usual) Kenn On Fri, Apr 21, 2017 at 1:33 PM, Tyler Akidau wrote: > Thanks for reading, Luke. The simple answer is that CoGBK is basically > flatten + G

Re: Towards a spec for robust streaming SQL, Part 1

2017-04-21 Thread Tyler Akidau
Thanks for reading, Luke. The simple answer is that CoGBK is basically flatten + GBK. Flatten is a non-grouping operation that merges the input streams into a single output stream. GBK then groups the data within that single union stream as you might otherwise expect, yielding a single table. So I
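
The "CoGBK is basically flatten + GBK" description can be sketched in plain Python. This is a minimal illustration, not Beam code: the function names are hypothetical, and tagging each value with its source stream is an assumption added here so that the grouped output still shows which input each value came from.

```python
from collections import defaultdict


def flatten(*tagged_streams):
    """Non-grouping merge: emit every element of every input as one stream.

    Each value is tagged with the name of its source stream (an assumption
    of this sketch, so the origin survives the merge).
    """
    for tag, stream in tagged_streams:
        for key, value in stream:
            yield key, (tag, value)


def group_by_key(stream):
    """Group the single union stream by key, yielding a single table."""
    table = defaultdict(list)
    for key, value in stream:
        table[key].append(value)
    return dict(table)


def co_group_by_key(**streams):
    """CoGBK expressed as flatten followed by GBK."""
    return group_by_key(flatten(*streams.items()))


joined = co_group_by_key(
    emails=[("amy", "a@x")],
    phones=[("amy", "111"), ("bob", "222")],
)
```

Here `joined` holds one row per key, with values from both inputs side by side under that key.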

Re: Towards a spec for robust streaming SQL, Part 1

2017-04-21 Thread Lukasz Cwik
The doc is a good read. I think you do a great job of explaining table -> stream, stream -> stream, and stream -> table when there is only one stream. But when there are multiple streams reading/writing to a table, how does that impact what occurs? For example, with CoGBK you have multiple streams

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Aljoscha Krettek
+1, as I’m almost always in favour of simplification > On 21. Apr 2017, at 19:59, Robert Bradshaw > wrote: > > Strongly in favor of removing this. If it's actually needed one can > incorporate the key into the value for inspection in the various > phases of the CombineFn, so it's no loss of ex

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Robert Bradshaw
Strongly in favor of removing this. If it's actually needed, one can incorporate the key into the value for inspection in the various phases of the CombineFn, so it's no loss of expressiveness. It's perfectly reasonable to make this (rare) use case more complicated to greatly simplify the common API.
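
A rough sketch of the workaround described here, in plain Python rather than the Beam API. The class and helper names are hypothetical, and the four-phase shape (create, add, merge, extract) is assumed for illustration: the key is copied into each value before combining, so an ordinary non-keyed combiner can still inspect it.

```python
from collections import defaultdict


class KeyAwareSum:
    """Hypothetical non-keyed combiner whose inputs are (key, value) pairs."""

    def create_accumulator(self):
        return (None, 0)  # (key seen so far, running total)

    def add_input(self, acc, element):
        key, value = element  # the key travels inside the value
        _, total = acc
        return (key, total + value)

    def merge_accumulators(self, accs):
        key, total = None, 0
        for k, t in accs:
            if k is not None:
                key = k
            total += t
        return (key, total)

    def extract_output(self, acc):
        key, total = acc
        return f"{key}={total}"


def combine_per_key(pairs, fn):
    """Group by key, folding the key into each value before combining."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append((key, value))
    result = {}
    for key, elements in grouped.items():
        acc = fn.create_accumulator()
        for element in elements:
            acc = fn.add_input(acc, element)
        result[key] = fn.extract_output(fn.merge_accumulators([acc]))
    return result


totals = combine_per_key([("a", 1), ("b", 2), ("a", 3)], KeyAwareSum())
```

The cost is that every value carries a copy of its key, which is the added complication for the rare case that the message accepts in exchange for a simpler common API.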

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Sourabh Bajaj
+1 On Fri, Apr 21, 2017 at 10:53 AM Thomas Groh wrote: > A happy +1. This simplifies the code base, and if we find a compelling use, > it shouldn't be too bad to add it back in. > > On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles > wrote: > > > Hi all, > > > > I propose that we remove KeyedCo

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Thomas Groh
A happy +1. This simplifies the code base, and if we find a compelling use, it shouldn't be too bad to add it back in. On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles wrote: > Hi all, > > I propose that we remove KeyedCombineFn before the first stable release. > > I don't think it adds enough

Re: [PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Davor Bonaci
+1 -- this is a good simplification. On Fri, Apr 21, 2017 at 10:24 AM, Kenneth Knowles wrote: > Hi all, > > I propose that we remove KeyedCombineFn before the first stable release. > > I don't think it adds enough value for the complexity it adds to e.g. > CombineWithContext [1] and state [2, 3]

[PROPOSAL] Remove KeyedCombineFn

2017-04-21 Thread Kenneth Knowles
Hi all, I propose that we remove KeyedCombineFn before the first stable release. I don't think it adds enough value for the complexity it adds to e.g. CombineWithContext [1] and state [2, 3], and it doesn't seem to me that users really use it when we might expect. I am happy to be demonstrated wr

Re: Can application specify how watermarks should be generated?

2017-04-21 Thread Shen Li
Hi, A follow-up question. I found that the getWatermark() API is only available for UnboundedSource. BoundedSource provides a getCurrentTimestamp() API with comments "By default, returns the minimum possible timestamp", which sounds like a watermark. Any reason for the difference in method names?