Thank you, Aljoscha! I look forward to reading the papers you mentioned. Regarding FLIP-2, are there any new use cases that a Window Function Context enables? If not, my understanding is that adding this context would be an optimization over what is already possible today, though perhaps inefficiently. As an example of how I think this would work: instead of a "firing reason" context to let you differentiate between (e.g.) every-30-second early firings and a daily primary firing, I imagine you could split the stream so that one branch exclusively emits 30-second aggregates and the other exclusively emits daily aggregates, and deal with them separately.
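A minimal sketch of the two options compared above, in plain Java with no Flink dependency. Every name in it (`FiringReasonSketch`, `FiringReason`, `Context`, `withContext`, `earlyPipeline`, `dailyPipeline`) is hypothetical and only illustrates the shape of the idea; none of it is taken from FLIP-2 or from Flink's API.

```java
// Hypothetical sketch: none of these types exist in Flink.
final class FiringReasonSketch {

    /** Hypothetical reason a window fired. */
    enum FiringReason { EARLY, ON_TIME }

    /** Hypothetical context handed to the window function (option A). */
    interface Context {
        FiringReason firingReason();
    }

    /** Option A: a single window function that branches on the firing reason. */
    static String withContext(Context ctx, long sum) {
        switch (ctx.firingReason()) {
            case EARLY:
                return "partial:" + sum;
            case ON_TIME:
                return "daily:" + sum;
            default:
                throw new IllegalStateException("unknown firing reason");
        }
    }

    /** Option B: the stream is split, and each branch has exactly one firing behaviour. */
    static String earlyPipeline(long sum) {
        return "partial:" + sum;
    }

    static String dailyPipeline(long sum) {
        return "daily:" + sum;
    }
}
```

Both options produce the same labels for the same input; the difference is that option B needs two separately windowed branches, while option A needs the firing reason to be exposed to the function.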
If that is the case, and it amounts to an optimization, have you considered whether the added complexity is worth the potential efficiency gain? Otherwise, if it amounts to more than a small optimization, I'd be very interested to understand what this change would enable; I currently don't see it. I am under time pressure to choose a viable project (the idea was to be solidified yesterday, actually), and I would very much like to work on this now if I can justify it. If not, I would still very much like to work on this, but the timing will have to be different.

Again, thank you Aljoscha, and I apologize for the rushed nature of my situation.

Best,
-aj heller

On Wed, Sep 14, 2016 at 1:19 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi AJ,
> the idea for evictors initially came from IBM Infosphere Streams, if I'm
> not mistaken:
> http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.dev.doc/doc/windowhandling.html
> The first version of the windowing system used a combination of
> triggers/evictors to do the windowing; this is described in Jonas Traub's
> thesis: http://www.diva-portal.se/smash/get/diva2:861798/FULLTEXT01.pdf.
>
> I'm quite skeptical about having support for Evictors in the first place.
> They make computation inefficient because you always have to keep a list of
> all elements and cannot incrementally aggregate using a reduce function.
> Also, it is quite tricky to figure out how to do eviction based on
> ProcessingTime with a good interface. If you have some ideas how this could
> be improved I'm open to anything.
>
> For now, I would suggest focusing on FLIP-2, since quite a number of people
> would be interested in having that. I would also not put any energy in
> trying to figure out how the context can be shared between evictors and
> other parts of the system. If we keep evictors I would like to keep the API
> and implementation completely separate from anything else that's going on
> in the system.
>
> On implementation, the context would probably be created by the WindowOperator
> or by the InternalWindowFunction.
>
> Cheers,
> Aljoscha
>
> On Mon, 12 Sep 2016 at 08:27 AJ Heller <a...@drfloob.com> wrote:
>
> > Could you point me towards the inspiration for Evictors? Are there any
> > papers, perhaps, that lay the groundwork for mutable windows like this?
> >
> > After much research this weekend, I found that Evictors are unique to
> > Flink. Conceptually, it looks to me like Dataflow windows are build-only.
> > Looking into other Dataflow implementations: I didn't find anything in
> > either the Apache Beam SDK docs or the Google Cloud Dataflow API docs that
> > mentions allowing you to remove elements from a window. I'm hesitant to
> > tread new ground in mutability.
> >
> > What do you think about reimplementing Evictors as a kind of cyclic filter
> > operation? Would it be possible? I believe this would fit into the Dataflow
> > model better, but I'm still in the early stages of becoming familiar with
> > Flink, and I haven't read the ABS paper [1] yet to know if there are
> > snapshot implications. I also don't (yet) see why you couldn't optimize
> > such a cyclic operation with mutable operations under the hood.
> >
> > [1]: http://arxiv.org/abs/1506.08603
> >
> > On Fri, Sep 9, 2016 at 11:46 AM, AJ Heller <a...@drfloob.com> wrote:
> >
> >> Thank you for offering your support, I'm excited to dig in!
> >>
> >> I have some work to do getting up to speed on the windowing internals.
> >> And I still need to get my bearings on the Evictor changes; I plan to read
> >> through the list archive and documents today. Vishnu, are your changes
> >> already publicly viewable?
> >>
> >> Regarding the window modifications in FLIP-2, I see Vishnu that you've
> >> suggested an interface for the EvictorContext object, and Aljoscha, you
> >> suggested an abstract Context class. Does it make sense for them to agree?
> >> The other big difference I've seen in the signatures is whether the Window
> >> is contained in the context or not.
> >>
> >> Have you considered modifying the signature of the methods to accept
> >> `<C extends Context>` or `<EC extends EvictorContext>`? At least in terms of
> >> FLIP-2, this would allow each process window function to define and work
> >> with its own context (without downcasting, anyway), and similarly in the
> >> future, there'd be less work in changing Context subclasses when new
> >> abstract methods are added to Context.
> >>
> >> But I may be getting ahead of myself. Could you point me towards where
> >> contexts are/would be created? I'm not clear on the ownership and lifecycle
> >> of these objects yet.
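
For reference, the `<C extends Context>` idea from the earliest message in this thread could look roughly like the sketch below. It is plain Java with no Flink dependency, and all of the types (`GenericContextSketch`, `Context`, `EvictorContext`, `SketchWindowFunction`) are hypothetical stand-ins, not the interfaces proposed in FLIP-2 or by Vishnu.

```java
// Hypothetical sketch: these are not Flink's actual interfaces.
final class GenericContextSketch {

    /** Hypothetical abstract base context. */
    static abstract class Context {
        abstract long windowEnd();
    }

    /** Hypothetical context subclass that adds one extra accessor. */
    static final class EvictorContext extends Context {
        private final long windowEnd;
        private final int elementCount;

        EvictorContext(long windowEnd, int elementCount) {
            this.windowEnd = windowEnd;
            this.elementCount = elementCount;
        }

        @Override
        long windowEnd() {
            return windowEnd;
        }

        int elementCount() {
            return elementCount;
        }
    }

    /** A window function parameterised on the context type it needs. */
    interface SketchWindowFunction<C extends Context, OUT> {
        OUT process(C context);
    }

    /** Uses EvictorContext-specific methods without any downcast. */
    static final SketchWindowFunction<EvictorContext, String> DESCRIBE =
            ctx -> ctx.elementCount() + " elements, window ends at " + ctx.windowEnd();
}
```

The bound on `C` is what lets `DESCRIBE` call `elementCount()` directly; with a plain `Context` parameter the function would have to downcast first.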