Hi AJ,

sorry for not getting back to you earlier. I was busy and only read your mail just now.
Adding the context is not a simple optimization of the case you described.
In your case, you will get 30-second windows, with elements assigned to
those windows based on their timestamps. If you instead have one big daily
window and do 30-second speculative (early) firing of that window based on
processing time, each firing may contain elements from anywhere in that
complete day, i.e. you progressively output a more refined result for that
one-day window.

Does that make sense to you?

Cheers,
Aljoscha

On Wed, 14 Sep 2016 at 18:21 AJ Heller <a...@drfloob.com> wrote:

> Thank you, Aljoscha! I look forward to reading the papers you mentioned.
>
> Regarding FLIP-2, are there any new use cases that a Window Function
> Context enables? If not, my understanding is that adding this context
> would be an optimization over what is currently possible, but maybe
> inefficient. As an example of how I think this would work, instead of a
> "firing reason" context to let you differentiate between (e.g.)
> every-30-second early firings and a daily primary firing, I imagine you
> could split the stream, where one exclusively emits 30-second aggregates
> and the other exclusively emits daily, and deal with them separately.
>
> If that is the case, that it amounts to an optimization: have you
> considered whether the added complexity is worth the potential efficiency
> gain? Otherwise, if it amounts to more than a small optimization, I'd be
> very interested to understand what this change would enable; I currently
> don't see it. I am under time pressure to choose a viable project (the
> idea was to be solidified yesterday, actually), and I would very much like
> to work on this now if I can justify it. If not, I would still very much
> like to work on this, but the timing will have to be different.
>
> Again, thank you Aljoscha, and I apologize for the rushed nature of my
> situation.
>
> Best,
> -aj heller
>
> On Wed, Sep 14, 2016 at 1:19 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
> > Hi AJ,
> > the idea for evictors initially came from IBM Infosphere Streams, if I'm
> > not mistaken:
> > http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.dev.doc/doc/windowhandling.html
> > The first version of the windowing system used a combination of
> > triggers/evictors to do the windowing; this is described in Jonas
> > Traub's thesis:
> > http://www.diva-portal.se/smash/get/diva2:861798/FULLTEXT01.pdf.
> >
> > I'm quite skeptical about having support for Evictors in the first
> > place. They make computation inefficient because you always have to keep
> > a list of all elements and cannot incrementally aggregate using a reduce
> > function. Also, it is quite tricky to figure out how to do eviction
> > based on ProcessingTime with a good interface. If you have some ideas
> > for how this could be improved, I'm open to anything.
> >
> > For now, I would suggest focusing on FLIP-2, since quite a number of
> > people would be interested in having that. I would also not put any
> > energy into trying to figure out how the context can be shared between
> > evictors and other parts of the system. If we keep evictors, I would
> > like to keep the API and implementation completely separate from
> > anything else that's going on in the system.
> >
> > On implementation, the context would probably be created by the
> > WindowOperator or by the InternalWindowFunction.
> >
> > Cheers,
> > Aljoscha
> >
> > On Mon, 12 Sep 2016 at 08:27 AJ Heller <a...@drfloob.com> wrote:
> >
> > > Could you point me towards the inspiration for Evictors? Are there any
> > > papers, perhaps, that lay the groundwork for mutable windows like
> > > this?
> > >
> > > After much research this weekend, I found that Evictors are unique to
> > > Flink. Conceptually, it looks to me like Dataflow windows are
> > > build-only.
> > > Looking into other Dataflow implementations: I didn't find anything
> > > in either the Apache Beam SDK docs or the Google Cloud Dataflow API
> > > docs that mentions allowing you to remove elements from a window. I'm
> > > hesitant to tread new ground in mutability.
> > >
> > > What do you think about reimplementing Evictors as a kind of cyclic
> > > filter operation? Would it be possible? I believe this would fit into
> > > the Dataflow model better, but I'm still in the early stages of
> > > becoming familiar with Flink, and I haven't read the ABS paper [1] yet
> > > to know if there are snapshot implications. I also don't (yet) see why
> > > you couldn't optimize such a cyclic operation with mutable operations
> > > under the hood.
> > >
> > > [1]: http://arxiv.org/abs/1506.08603
> > >
> > >
> > > On Fri, Sep 9, 2016 at 11:46 AM, AJ Heller <a...@drfloob.com> wrote:
> > >
> > >> Thank you for offering your support, I'm excited to dig in!
> > >>
> > >> I have some work to do getting up to speed on the windowing
> > >> internals. And I still need to get my bearings on the Evictor
> > >> changes; I plan to read through the list archive and documents today.
> > >> Vishnu, are your changes already publicly viewable?
> > >>
> > >> Regarding the window modifications in FLIP-2, I see, Vishnu, that
> > >> you've suggested an interface for the EvictorContext object, and
> > >> Aljoscha, you suggested an abstract Context class. Does it make sense
> > >> for them to agree? The other big difference I've seen in the
> > >> signatures is whether the Window is contained in the context or not.
> > >>
> > >> Have you considered modifying the signature of the methods to accept
> > >> `<C extends Context>` or `<EC extends EvictorContext>`?
> > >> At least in terms of FLIP-2, this would allow each process window
> > >> function to define and work with its own context (without
> > >> downcasting, anyway), and similarly in the future, there'd be less
> > >> work in changing Context subclasses when new abstract methods are
> > >> added to Context.
> > >>
> > >> But I may be getting ahead of myself. Could you point me towards
> > >> where contexts are/would be created? I'm not clear on the ownership
> > >> and lifecycle of these objects yet.
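
To make the distinction drawn in the top reply of this thread concrete, here is a minimal plain-Java sketch. It does not use any Flink API; the class, method, and field names and the sample data are all made up for illustration. It contrasts 30-second windows assigned by element timestamp with a single long window that fires early every 30 seconds of processing time and re-emits its running total. It assumes non-negative times in seconds and that every sample element belongs to the same daily window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation only; no Flink classes are used and all names are hypothetical.
class WindowingSketch {

    /** One element: event time, arrival (processing) time, and a value to sum. */
    static final class Event {
        final long eventTimeSec;
        final long arrivalSec;
        final int value;

        Event(long eventTimeSec, long arrivalSec, int value) {
            this.eventTimeSec = eventTimeSec;
            this.arrivalSec = arrivalSec;
            this.value = value;
        }
    }

    /** Made-up sample data, all within the same (assumed) daily window. */
    static List<Event> sampleEvents() {
        List<Event> events = new ArrayList<>();
        events.add(new Event(5, 10, 1));
        events.add(new Event(40, 45, 2));
        events.add(new Event(20, 70, 4)); // arrives late relative to its timestamp
        return events;
    }

    /** Case 1: tumbling windows; each element lands in exactly one window by event time. */
    static Map<Long, Integer> tumblingSums(List<Event> events, long sizeSec) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (Event e : events) {
            long windowStart = e.eventTimeSec - (e.eventTimeSec % sizeSec);
            sums.merge(windowStart, e.value, Integer::sum);
        }
        return sums;
    }

    /**
     * Case 2: one long window with early firings. Every intervalSec of
     * processing time, emit the sum of everything that has arrived so far,
     * regardless of where in the day its timestamp falls.
     */
    static List<Integer> earlyFirings(List<Event> events, long intervalSec, long untilSec) {
        List<Integer> firings = new ArrayList<>();
        for (long fireAt = intervalSec; fireAt <= untilSec; fireAt += intervalSec) {
            int sum = 0;
            for (Event e : events) {
                if (e.arrivalSec < fireAt) {
                    sum += e.value;
                }
            }
            firings.add(sum);
        }
        return firings;
    }

    public static void main(String[] args) {
        List<Event> events = sampleEvents();
        System.out.println(tumblingSums(events, 30));     // prints {0=5, 30=2}
        System.out.println(earlyFirings(events, 30, 90)); // prints [1, 3, 7]
    }
}
```

In the first case the element with timestamp 20 that arrives at second 70 is counted in the window starting at 0, and each window holds only its own 30 seconds of data. In the second case the same element simply refines the running total at the next early firing, which is the "progressively more refined result" described in the reply.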