Hi AJ,

the idea for evictors initially came from IBM Infosphere Streams, if I'm not mistaken: http://www.ibm.com/support/knowledgecenter/SSCRJU_4.0.0/com.ibm.streams.dev.doc/doc/windowhandling.html

The first version of the windowing system used a combination of triggers and evictors to do the windowing. This is described in Jonas Traub's thesis: http://www.diva-portal.se/smash/get/diva2:861798/FULLTEXT01.pdf.

I'm quite skeptical about having support for Evictors in the first place. They make computation inefficient because you always have to keep a list of all elements and cannot incrementally aggregate using a reduce function. Also, it is quite tricky to figure out how to do eviction based on ProcessingTime with a good interface. If you have some ideas on how this could be improved, I'm open to anything.

For now, I would suggest focusing on FLIP-2, since quite a number of people would be interested in having that. I would also not put any energy into trying to figure out how the context can be shared between evictors and other parts of the system. If we keep evictors, I would like to keep their API and implementation completely separate from anything else that's going on in the system.

As for the implementation, the context would probably be created by the WindowOperator or by the InternalWindowFunction.

Cheers,
Aljoscha

On Mon, 12 Sep 2016 at 08:27 AJ Heller <a...@drfloob.com> wrote:

> Could you point me towards the inspiration for Evictors? Are there any
> papers, perhaps, that lay the groundwork for mutable windows like this?
>
> After much research this weekend, I found that Evictors are unique to
> Flink. Conceptually, it looks to me like Dataflow windows are build-only.
> Looking into other Dataflow implementations: I didn't find anything in
> either the Apache Beam SDK docs or the Google Cloud Dataflow API docs that
> mention allowing you to remove elements from a window. I'm hesitant to
> tread new ground in mutability.
>
> What do you think about reimplementing Evictors as a kind of cyclic filter
> operation? Would it be possible? I believe this would fit into the Dataflow
> model better, but I'm still in the early stages of becoming familiar with
> Flink, and I haven't read the ABS paper [1] yet to know if there are
> snapshot implications. I also don't (yet) see why you couldn't optimize
> such a cyclic operation with mutable operations under the hood.
>
> [1]: http://arxiv.org/abs/1506.08603
>
>
> On Fri, Sep 9, 2016 at 11:46 AM, AJ Heller <a...@drfloob.com> wrote:
>
>> Thank you for offering your support, I'm excited to dig in!
>>
>> I have some work to do getting up to speed on the windowing internals.
>> And I still need to get my bearings on the Evictor changes, I plan to read
>> through the list archive and documents today. Vishnu, are your changes
>> already publicly viewable?
>>
>> Regarding the window modifications in FLIP-2, I see Vishnu that you've
>> suggested an interface for the EvictorContext object, and Aljoscha, you
>> suggested an abstract Context class. Does it make sense for them to agree?
>> The other big difference I've seen in the signatures is whether the Window
>> is contained in the context or not.
>>
>> Have you considered modifying the signature of the methods to accept `<C
>> extends Context>` or `<EC extends EvictorContext>`? At least in terms of
>> FLIP-2, this would allow each process window function to define and work
>> with its own context (without downcasting, anyway), and similarly in the
>> future, there'd be less work in changing Context subclasses when new
>> abstract methods are added to Context.
>>
>> But I may be getting ahead of myself. Could you point me towards where
>> contexts are/would be created? I'm not clear on the ownership and lifecycle
>> of these objects yet.
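
To make the efficiency argument from the reply at the top of this thread concrete (an evictor forces the window to keep every element, while a reduce function only needs one running value), here is a minimal sketch in plain Java. It deliberately does not use Flink's API: the names `WindowSketch`, `ReducingWindow`, `EvictingWindow`, `fire` and `stateSize` are hypothetical and exist only for this illustration, and the count-based eviction shown is just one simple policy assumed for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BinaryOperator;

// Illustration only: not Flink code, all names are hypothetical.
final class WindowSketch {

    /** Incremental aggregation: the window state is a single running value. */
    static final class ReducingWindow<T> {
        private final BinaryOperator<T> reducer;
        private T acc; // null until the first element arrives

        ReducingWindow(BinaryOperator<T> reducer) {
            this.reducer = reducer;
        }

        void add(T element) {
            acc = (acc == null) ? element : reducer.apply(acc, element);
        }

        T result() {
            return acc;
        }

        int stateSize() {
            return acc == null ? 0 : 1;
        }
    }

    /**
     * Window with a count-based eviction step: every element must be
     * buffered, because which elements survive is only known at firing time.
     */
    static final class EvictingWindow<T> {
        private final Deque<T> buffer = new ArrayDeque<>();
        private final BinaryOperator<T> reducer;
        private final int maxCount;

        EvictingWindow(BinaryOperator<T> reducer, int maxCount) {
            this.reducer = reducer;
            this.maxCount = maxCount;
        }

        void add(T element) {
            buffer.addLast(element);
        }

        /** Evicts the oldest elements beyond maxCount, then aggregates the rest. */
        T fire() {
            while (buffer.size() > maxCount) {
                buffer.removeFirst();
            }
            T acc = null;
            for (T element : buffer) {
                acc = (acc == null) ? element : reducer.apply(acc, element);
            }
            return acc;
        }

        int stateSize() {
            return buffer.size();
        }
    }

    public static void main(String[] args) {
        ReducingWindow<Integer> reducing = new ReducingWindow<>(Integer::sum);
        EvictingWindow<Integer> evicting = new EvictingWindow<>(Integer::sum, 3);
        for (int i = 1; i <= 5; i++) {
            reducing.add(i);
            evicting.add(i);
        }
        System.out.println("reducing state size: " + reducing.stateSize());
        System.out.println("evicting state size: " + evicting.stateSize());
        System.out.println("reduce result:       " + reducing.result());
        System.out.println("evict + reduce:      " + evicting.fire());
    }
}
```

The reducing window never holds more than one value, whereas the evicting window's state grows with the number of elements until it fires. That growth is the cost described above.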