Hi,
Yes, I have some thoughts about Evictors as well, but I haven’t written them 
down yet. The basic idea is this:

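interface Evictor<T, W> {
   // Returns the predicate that decides which buffered elements to evict.
   Predicate<T> getPredicate(Iterable<StreamRecord<T>> elements, int size, W window);
}

interface Predicate<T> {
   // Returns true if the given element should be evicted from the buffer.
   boolean evict(StreamRecord<T> element);
}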

The evictor returns a predicate that is evaluated on every element in the 
buffer to decide whether to evict it or keep it. The predicate can keep 
internal state, so with the size it gets in getPredicate() it can do 
count-based eviction (just evict elements until you reach your desired quota). 
We can also do eviction based on event time, which was not possible before 
because you could only evict from the start of the buffer. What do you think?
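
To make this a bit more concrete, here is a rough sketch of how count-based and 
event-time eviction could look with such an interface. It is only an 
illustration under assumptions: StreamRecord is a simplified stand-in for the 
real class, and CountEvictor, EventTimeEvictor and applyEvictor are 
hypothetical names, not existing Flink code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EvictorSketch {

    // Simplified stand-in (assumption): a value plus its event timestamp.
    static final class StreamRecord<T> {
        final T value;
        final long timestamp;
        StreamRecord(T value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // The two interfaces from the proposal.
    interface Predicate<T> {
        boolean evict(StreamRecord<T> element);
    }

    interface Evictor<T, W> {
        Predicate<T> getPredicate(Iterable<StreamRecord<T>> elements, int size, W window);
    }

    // Hypothetical count-based evictor: evicts from the front until maxCount remain.
    static final class CountEvictor<T, W> implements Evictor<T, W> {
        private final int maxCount;
        CountEvictor(int maxCount) { this.maxCount = maxCount; }

        @Override
        public Predicate<T> getPredicate(Iterable<StreamRecord<T>> elements, int size, W window) {
            return new Predicate<T>() {
                // Internal state: how many elements still have to be evicted.
                private int remaining = Math.max(0, size - maxCount);

                @Override
                public boolean evict(StreamRecord<T> element) {
                    if (remaining > 0) {
                        remaining--;
                        return true;
                    }
                    return false;
                }
            };
        }
    }

    // Hypothetical event-time evictor: evicts every element that is more than
    // keepMillis older than the newest element, wherever it sits in the buffer.
    static final class EventTimeEvictor<T, W> implements Evictor<T, W> {
        private final long keepMillis;
        EventTimeEvictor(long keepMillis) { this.keepMillis = keepMillis; }

        @Override
        public Predicate<T> getPredicate(Iterable<StreamRecord<T>> elements, int size, W window) {
            long maxTs = Long.MIN_VALUE;
            for (StreamRecord<T> r : elements) {
                maxTs = Math.max(maxTs, r.timestamp);
            }
            // Guard against underflow for an empty buffer.
            final long cutoff = (maxTs == Long.MIN_VALUE) ? Long.MIN_VALUE : maxTs - keepMillis;
            return element -> element.timestamp < cutoff;
        }
    }

    // Hypothetical driver: asks for a predicate once, then applies it to every element.
    static <T, W> List<StreamRecord<T>> applyEvictor(
            Evictor<T, W> evictor, List<StreamRecord<T>> buffer, W window) {
        Predicate<T> predicate = evictor.getPredicate(buffer, buffer.size(), window);
        List<StreamRecord<T>> kept = new ArrayList<>();
        for (StreamRecord<T> r : buffer) {
            if (!predicate.evict(r)) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Out-of-order timestamps, as can happen with event time.
        List<StreamRecord<String>> buffer = Arrays.asList(
            new StreamRecord<>("a", 10L), new StreamRecord<>("b", 50L),
            new StreamRecord<>("c", 20L), new StreamRecord<>("d", 60L));
        for (StreamRecord<String> r : applyEvictor(new CountEvictor<String, Object>(2), buffer, null)) {
            System.out.println("count keeps " + r.value);
        }
        for (StreamRecord<String> r : applyEvictor(new EventTimeEvictor<String, Object>(15L), buffer, null)) {
            System.out.println("event time keeps " + r.value);
        }
    }
}
```

With this buffer the count-based evictor drops the first two elements, while 
the event-time one drops "a" and "c", i.e. elements that are not at the start 
of the buffer.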

Cheers,
Aljoscha
> On 22 Mar 2016, at 09:24, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Thanks for the write-up Aljoscha.
> I think it is a really good idea to separate the different aspects (firing, 
> purging, lateness) a bit. At the moment, all of these need to be handled in 
> the Trigger, and a custom trigger is necessary whenever you want some of 
> these aspects handled slightly differently. This makes the Trigger interface 
> and its implementations really hard to understand.
> 
> +1 for the suggested changes. 
> Are there plans to touch the Evictor interface as well? IMO, this needs a 
> redesign as well.
> 
> Fabian
> 
> 2016-03-21 19:21 GMT+01:00 Aljoscha Krettek <aljos...@apache.org>:
> Hi,
> my previous message might be a bit hard to parse for people that are not very 
> deep into the Trigger implementation. So I’ll try to give a bit more 
> explanation right in the mail.
> 
> The basic idea is that we observed some basic problems that keep coming up 
> for people on the mailing lists and I want to try and address them.
> 
> The first problem is with the Trigger semantics and the confusion between 
> triggers that purge the window contents and those that don’t. (For example, 
> using a ContinuousEventTimeTrigger with EventTimeWindows assigner is a bad 
> idea because state will be kept indefinitely.) While working on this we 
> should also tackle the issue of providing composite triggers such as 
> Repeatedly (fires a child-trigger repeatedly), Any (fires when any child 
> trigger fires) and All (fires when all child triggers fire).
> 
> The second problem is lateness. Right now, it is possible to write custom 
> triggers that can deal with late elements and can even behave differently 
> based on the amount of lateness. There is, however, no API for dealing with 
> lateness. We should address this.
> 
> The third issue is Trigger testability. We should introduce a testing harness 
> for triggers and move the processing time triggers to use a clock provider 
> instead of directly using System.currentTimeMillis(). This will allow testing 
> them deterministically.
> 
> All of these are expanded upon in the document I linked to before: 
> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit?usp=sharing
> I think all of this is very important for people working on event-time-based 
> pipelines.
> 
> Feedback is very welcome and I hope that we can expand the document together 
> and come up with good solutions.
> 
> Cheers,
> Aljoscha
> > On 21 Mar 2016, at 17:46, Aljoscha Krettek <aljos...@apache.org> wrote:
> >
> > Hi,
> > I’m also sending this to @user because the Trigger API concerns users 
> > directly.
> >
> > There are some things in the Trigger API that I think require some 
> > improvements. The issues are trigger testability, fire semantics, 
> > composite triggers, and lateness. I started a document to keep track of 
> > things 
> > (https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit?usp=sharing).
> > Please read it if you are interested and want to get involved in this. 
> > We’ll evolve the document together and come up with Jira issues for the 
> > subtasks.
> >
> > Cheers,
> > Aljoscha
> 
> 
