Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-09-09 Thread Guozhang Wang
Hello Matthias, Thanks for your thoughts! On Mon, Sep 9, 2019 at 6:02 PM Matthias J. Sax wrote: > From my point of view, a Tumbling/Hopping window has different semantics > than a Sliding-Window, and hence, I am not convinced atm that it's a > good idea to use 1ms-hopping-windows. > > > > (1) I

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-09-09 Thread Matthias J. Sax
From my point of view, a Tumbling/Hopping window has different semantics than a Sliding-Window, and hence, I am not convinced atm that it's a good idea to use 1ms-hopping-windows. (1) I think that the window bounds are different, i.e., while a time-window has the lower-start-time as an inclusive

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-09-09 Thread Guozhang Wang
Hello John, I like your idea of adding a new Combinator interface better! In addition to your arguments, we can also leverage each overloaded function that users supply to select a different aggregation implementation (i.e. if combinator is provided we can do window-slicing, otherwise we follow the c
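As a rough illustration of the window-slicing idea mentioned above (the message preview is truncated, so this is only a sketch under assumptions, not the design proposed in the thread): each small slice is aggregated once, and a user-supplied combinator merges consecutive slice aggregates into a window result. The class `WindowSlicingSketch` and the method `combineSlices` are hypothetical names and are not part of Kafka Streams. The example uses max, which has no inverse, so it cannot rely on subtracting expired records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class WindowSlicingSketch {

    /**
     * Hypothetical helper (not a Kafka Streams API): merges every run of
     * `slicesPerWindow` consecutive slice aggregates into one window result
     * using the supplied combinator.
     */
    public static <A> List<A> combineSlices(List<A> sliceAggs,
                                            int slicesPerWindow,
                                            BinaryOperator<A> combinator) {
        List<A> windows = new ArrayList<>();
        for (int start = 0; start + slicesPerWindow <= sliceAggs.size(); start++) {
            // Start from the first slice and fold in the remaining slices.
            A acc = sliceAggs.get(start);
            for (int j = 1; j < slicesPerWindow; j++) {
                acc = combinator.apply(acc, sliceAggs.get(start + j));
            }
            windows.add(acc);
        }
        return windows;
    }

    public static void main(String[] args) {
        // Per-slice maxima, computed once per slice.
        List<Integer> sliceMax = Arrays.asList(3, 1, 4, 1, 5);
        // Each window spans three consecutive slices.
        System.out.println(combineSlices(sliceMax, 3, Integer::max)); // prints [4, 4, 5]
    }
}
```

Each record is aggregated into exactly one slice, and overlapping windows reuse those slice aggregates instead of re-aggregating the raw records.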

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-09-06 Thread John Roesler
Thanks for this idea, Guozhang, it does seem to be a nice way to solve the problem. I'm a _little_ concerned about the interface, though. It might be better to just add a new argument to a new method overload like `(initializer, aggregator, merger/combinator/whatever)`. Two reasons come to mind f

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-09-04 Thread Guozhang Wang
Hi folks, I've been thinking more about this KIP and my understanding is that we want to introduce a new SlidingWindow notion for aggregation since our current TimeWindow aggregation is not very efficient with very small steps. So I'm wondering whether, rather than introducing a new implementation mec

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-16 Thread Matthias J. Sax
Thanks Sophie! Regarding (4), I am in favor of supporting both. Not sure if we can reuse the existing window store (with storing duplicates enabled) for this case or not though, or if we need to design a new store to keep all raw records? Btw: for holistic aggregations, like median, we would need to s

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-16 Thread Guozhang Wang
Regarding 4): yes I agree with you that invertibility is not a common property for agg-functions. Just to be clear about our current APIs: for stream.aggregate we only require a single Adder function, whereas for table.aggregate we require both Adder and Subtractor, but these are not used to levera
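To make the invertibility point concrete, here is a minimal plain-Java sketch (no Kafka dependency). It assumes a count-based window rather than the time-based windows the KIP discusses, and the names `InvertibleSlidingSum` and `slidingSums` are hypothetical. When both an adder and a subtractor are available, each slide updates the running aggregate in constant time instead of re-aggregating the whole window.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.function.LongBinaryOperator;

public class InvertibleSlidingSum {

    /**
     * Hypothetical helper (not a Kafka Streams API): emits the aggregate of
     * the last `windowSize` values after each input value.
     */
    public static long[] slidingSums(long[] values,
                                     int windowSize,
                                     LongBinaryOperator adder,
                                     LongBinaryOperator subtractor) {
        long[] out = new long[values.length];
        Deque<Long> window = new ArrayDeque<>();
        long agg = 0L;
        for (int i = 0; i < values.length; i++) {
            // Add the newest value to the running aggregate.
            window.addLast(values[i]);
            agg = adder.applyAsLong(agg, values[i]);
            // Once the window is over capacity, subtract the expired value.
            if (window.size() > windowSize) {
                agg = subtractor.applyAsLong(agg, window.removeFirst());
            }
            out[i] = agg;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] sums = slidingSums(new long[] {1, 2, 3, 4}, 2, Long::sum, (a, v) -> a - v);
        System.out.println(Arrays.toString(sums)); // prints [1, 3, 5, 7]
    }
}
```

An aggregation such as max or median has no subtractor of this kind, which is why non-invertible functions need either the raw records or some other strategy.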

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-15 Thread Sophie Blee-Goldman
Thanks for the feedback Matthias and Bill. After discussing offline we realized the type of windows I originally had in mind were quite different, and I agree now that the semantics outlined by Matthias are the direction to go in here. I will update the KIP accordingly with the new semantics (and c

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-12 Thread Bill Bejeck
Thanks for the KIP Sophie. I have a couple of additional comments. The current proposal only considers stream-time. While I support this, each time we introduce a new operation based on stream-time, invariably users request that operation support wall-clock time as well. Would we want to consid

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-12 Thread Guozhang Wang
On Thu, Apr 11, 2019 at 2:10 PM Sophie Blee-Goldman wrote: > Thanks for the comments Guozhang! I've answered your questions below > > On Tue, Apr 9, 2019 at 4:38 PM Guozhang Wang wrote: > > > Hi Sophie, > > > > Thanks for the proposed KIP. I've made a pass over it and here are some > > thoughts:

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-11 Thread Sophie Blee-Goldman
Thanks for the comments Guozhang! I've answered your questions below On Tue, Apr 9, 2019 at 4:38 PM Guozhang Wang wrote: > Hi Sophie, > > Thanks for the proposed KIP. I've made a pass over it and here are some > thoughts: > > 1. "The window size is effectively the grace and retention period". Th

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-11 Thread Matthias J. Sax
Thanks for the KIP Sophie. Couple of comments: It's a little unclear to me, what public API you propose. It seems you want to add > public class SlidingWindow extends TimeWindow {} and > public class SlidingWindows extends TimeWindows {} // or maybe `extends > Windows` If yes, should we add c

Re: [DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-09 Thread Guozhang Wang
Hi Sophie, Thanks for the proposed KIP. I've made a pass over it and here are some thoughts: 1. "The window size is effectively the grace and retention period". The grace time is defined as "the time to admit late-arriving events after the end of the window." hence it is the additional time beyon

[DISCUSS] KIP-450: Sliding Window Aggregations in the DSL

2019-04-05 Thread Sophie Blee-Goldman
Hello all, I would like to kick off discussion of this KIP aimed at providing sliding window semantics to DSL aggregations. https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL Please take a look and share any thoughts you have regarding the API, se