Hi, I opened this Jira which should help in implementing the Trigger DSL but is also independent in that it just enhances the range of things that can be done with a Trigger: https://issues.apache.org/jira/browse/FLINK-4415
Cheers, Aljoscha On Wed, 17 Aug 2016 at 14:38 Jark Wu <wuchong...@alibaba-inc.com> wrote: > Hi Aljoscha, Kostas, thanks for your detailed explanation. It makes sense. > > According to the discarding and accumulating, the FLIP says “the mode of > parent trigger overwrites that of its children”. That means Trigger decide > whether to discard window contents after firing , right ? But I find the > origin google doc[1] proposed the Trigger only decide whether to fire or > not while the purging behavior is determined by a setting on > WindowedStream. Such as : > > datastream.keyBy(0) > .window(windowAssigner) > .trigger(compositeTrigger) > .accumulating() > > > [1] > https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u > < > https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u > > > > - Jark Wu > > > 在 2016年8月17日,下午6:12,Aljoscha Krettek <aljos...@apache.org> 写道: > > > > Hi, > > I think that would blow up state since there can be several triggers that > > need this kind of state, Any and All come to mind, possibly. If each of > > those keeps state that's at least a byte per trigger. If the finished > state > > were kept centrally by the TriggerRunner it would just be one byte for > > everything, in most cases. > > > > As I said, in some cases keeping that extra bit can be avoided. For > > example, if you have Repeat.forever(Some.trigger()) you know that the > > finished bit will always be false and so you don't keep any state in the > > TriggerRunner. If every trigger manually does that bookkeeping you remove > > that possibility while increasing complexity in each Trigger > implementation. > > > > Cheers, > > Aljoscha > > > > On Wed, 17 Aug 2016 at 12:05 Kostas Kloudas <k.klou...@data-artisans.com > > > > wrote: > > > >> Hi Aljoscha, > >> > >> On the Repeat.? addition, I think that each trigger will have to have > >> its own implementation, e.g. the CountTrigger should just set a dummy > >> value in the counter in order to know if it should fire again or not. > >> > >> In other case, we will have to add more state and this can lead to > >> significant > >> performance degradation, as in most cases this state has to be checked > on > >> every element. > >> > >> Another potential solution, which I am not sure if it covers all cases, > >> could > >> be to have a State abstraction like CompositeState, apart from the > >> Value, List, Reduce, Fold, which can fetch more than one types of state > >> with one round trip to the backend. Imagine having the “counter" and the > >> “canceled” states in the same entry in the backend and always fetch them > >> together. This can lead to zero additional cost for the extra state. > >> > >> What do you think? > >> > >> Kostas > >> > >>> On Aug 17, 2016, at 11:57 AM, Aljoscha Krettek <aljos...@apache.org> > >> wrote: > >>> > >>> Regarding Repeat.forever() and the default being to not repeat. The > >> simple > >>> reason is that Beam (née Google Dataflow) provides basically the same > >> thing > >>> with their trigger DSL and that their triggers behave like this. I > think > >> it > >>> would not be beneficial to have the same feature in two systems in that > >>> space where the behavior is the opposite. That would make it confusing > >> for > >>> users. > >>> > >>> On the implementation side, I think in most cases you need to have a > way > >> of > >>> telling when triggers are finished or not anyways. There could be a > >> central > >>> component in the TriggerRunner that has a finished bit for every > trigger > >> in > >>> the tree. In most cases this would be a simple byte. Triggers could set > >> and > >>> query this finished bit. In some cases, where you know that triggers > can > >>> never finish you could have a dummy implementation of the finished set > >> that > >>> does not store any state and always returns false when queried. > >>> > >>> On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <aljos...@apache.org> > >> wrote: > >>> > >>>> Kostas already nicely explained this! > >>>> > >>>> I just want to give some theoretical background. I see the underlying > >> idea > >>>> of triggers similar to predicates, i.e. > >>>> > >> > "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)" > >>>> translates to a predicate "(E and ET) or WT" (where E is a predicate > >> that > >>>> is true when we are in early phase, ET is the early trigger and WT is > >> the > >>>> watermark trigger). The other trigger translates to "(!E and LT) or > WT", > >>>> i.e. it triggers if we're not early and LT is true or if the watermark > >>>> trigger is true. If we combine the two we get: > >>>> > >>>> ((E and ET) or WT) and ((!E and LT) or WT) > >>>> > >>>> now we can eliminate the two parts with E and !E because they can > never > >> be > >>>> true and are in an "or": > >>>> > >>>> WT and WT > >>>> > >>>> which yield just "WT". > >>>> > >>>> Hope that makes sense to you. > >>>> > >>>> Cheers, > >>>> Aljoscha > >>>> > >>>> > >>>> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas < > >> k.klou...@data-artisans.com> > >>>> wrote: > >>>> > >>>>> Hello Jark Wu, > >>>>> > >>>>> Both of them will work in the new DSL. The idea is that there should > >> be no > >>>>> restrictions on the combinations one can do. > >>>>> > >>>>> Coming to what does the early and the late trigger do, the early > >> trigger > >>>>> will > >>>>> be responsible for specifying when the trigger should fire in the > >> period > >>>>> between > >>>>> the beginning of the window and the time when the watermark passes > the > >> end > >>>>> of the window. The late trigger takes over after the watermark passes > >> the > >>>>> end of > >>>>> the window, and specifies when the trigger should fire in the period > >>>>> between the > >>>>> endOfWindow and endOfWindow + allowedLateness. > >>>>> > >>>>> So in the case of the: > >>>>> All(EventTimeTrigger.afterEndOfWindow() > >>>>> .withEarlyTrigger(earlyFiringTrigger), > >>>>> EventTimeTrigger.afterEndOfWindow() > >>>>> .withLateTrigger(lateFiringTrigger)) > >>>>> > >>>>> The trigger will only fire at the end of the window, as this is the > >> only > >>>>> time both > >>>>> triggers will say FIRE. > >>>>> > >>>>> Although the above will work, the example that you gave is a nice one > >> as > >>>>> it > >>>>> degenerates to an: > >>>>> > >>>>> EventTimeTrigger.afterEndOfWindow() > >>>>> > >>>>> Detecting this and giving the simplest trigger for the job can lead > to > >>>>> further > >>>>> optimizations, as it can for example reduce the amount of state the > >>>>> trigger has to keep. > >>>>> > >>>>> That would actually be a very nice addition to have as in some cases > it > >>>>> can lead > >>>>> to performance improvements. > >>>>> > >>>>> Thanks for the feedback! > >>>>> > >>>>> Kostas > >>>>> > >>>>>> On Aug 17, 2016, at 4:36 AM, Jark Wu <wuchong...@alibaba-inc.com> > >>>>> wrote: > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> It’s a cool design, I really like it ! I have two questions here. > >>>>>> > >>>>>> The first is whether do we have the complex composite triggers, i.e. > >>>>> nested All and Any. Such as : > >>>>>> > >>>>>> Any( > >>>>>> All(trigger1, trigger2), > >>>>>> Any(trigger3, trigger4) > >>>>>> ) > >>>>>> > >>>>>> Can the above code work? > >>>>>> > >>>>>> Another question is : In composite triggers, what’s the behavior of > >>>>> withEarlyTrigger and withLateTrigger ? For example, > >>>>>> > >>>>>> All(EventTimeTrigger.afterEndOfWindow() > >>>>>> .withEarlyTrigger(earlyFiringTrigger), > >>>>>> EventTimeTrigger.afterEndOfWindow() > >>>>>> .withLateTrigger(lateFiringTrigger)) > >>>>>> > >>>>>> Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both > >>>>> work ? > >>>>>> > >>>>>> > >>>>>> - Jark Wu > >>>>>> > >>>>>>> 在 2016年8月17日,上午12:24,Kostas Kloudas <k.klou...@data-artisans.com> > >> 写道: > >>>>>>> > >>>>>>> Hi Aljoscha, > >>>>>>> > >>>>>>> Thanks for the feedback! > >>>>>>> > >>>>>>> It is a nice feature to have. The reason it is not included in the > >> FLIP > >>>>>>> is that I have not seen somebody asking for something similar in > the > >>>>>>> mailing list. > >>>>>>> > >>>>>>> A point that I have to add is that it seems (from the user ML) that > >>>>>>> most of the times users expect the “Repeated.forever” behavior to > >>>>>>> be the default. > >>>>>>> > >>>>>>> Given this, I would say that we should make this the default and > >>>>>>> add something like “Repeat.Once” option which will just let the > >> trigger > >>>>>>> fire once, e.g. the first time the counter reaches 5 in your > example, > >>>>>>> and then stop. > >>>>>>> > >>>>>>> In other case, the trigger specification may become too verbose, > >>>>>>> as the user will have to write the “Repeat.forever” for all child > >>>>> triggers. > >>>>>>> > >>>>>>> What do you think? > >>>>>>> > >>>>>>> Kostas > >>>>>>> > >>>>>>>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek < > aljos...@apache.org> > >>>>> wrote: > >>>>>>>> > >>>>>>>> Ah, I just read the document again and noticed that it might be > good > >>>>> to > >>>>>>>> differentiate between repeatable triggers and non-repeating > >> triggers. > >>>>> I'm > >>>>>>>> proposing to make most triggers non-repeating with the addition > of a > >>>>>>>> trigger that makes other triggers repeatable. > >>>>>>>> > >>>>>>>> Example Non-Repeating: > >>>>>>>> EventTimeTrigger.pastEndOfWindow() > >>>>>>>> .withEarlyFiring(CountTrigger.of(5)) > >>>>>>>> > >>>>>>>> this gives me an early firing once I got 5 elements and then an > >>>>> on-time > >>>>>>>> firing once the watermark passes the end of the window. > >>>>>>>> > >>>>>>>> Example with Repeating: > >>>>>>>> EventTimeTrigger.pastEndOfWindow() > >>>>>>>> .withEarlyFiring(Repeated.forever(CountTrigger.of(5))) > >>>>>>>> > >>>>>>>> this gives me early firings whenever I see 5 new elements plus the > >>>>>>>> watermark firing. > >>>>>>>> > >>>>>>>> What do you think? > >>>>>>>> > >>>>>>>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas < > >>>>> k.klou...@data-artisans.com> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks Till! > >>>>>>>>> > >>>>>>>>> Kostas > >>>>>>>>> > >>>>>>>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann < > trohrm...@apache.org> > >>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> Cool design doc Klou. It's well described with a lot of > details. I > >>>>> like > >>>>>>>>> it > >>>>>>>>>> a lot :-) +1 for implementing the trigger DSL. > >>>>>>>>>> > >>>>>>>>>> Cheers, > >>>>>>>>>> Till > >>>>>>>>>> > >>>>>>>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas < > >>>>>>>>> k.klou...@data-artisans.com > >>>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Thanks for the feedback Ufuk! > >>>>>>>>>>> I will do that. > >>>>>>>>>>> > >>>>>>>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <u...@apache.org> > >> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes > >>>>> sense > >>>>>>>>>>>> to merge the two documents by moving the Google doc contents > to > >>>>> the > >>>>>>>>>>>> Wiki. I think they form one unit. > >>>>>>>>>>>> > >>>>>>>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas > >>>>>>>>>>>> <k.klou...@data-artisans.com> wrote: > >>>>>>>>>>>>> Hi all! > >>>>>>>>>>>>> > >>>>>>>>>>>>> I've created a FLIP for the trigger DSL. This is the triggers > >>>>>>>>>>>>> that we want Apache Flink to support out-of-the-box. This > >>>>> proposal > >>>>>>>>>>>>> builds on various discussions on the mailing list and aims at > >>>>>>>>>>>>> serving as a base for further ones. > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>> > >>>>> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL > >>>>>>>>>>> < > >>>>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL> > >>>>>>>>>>>>> > >>>>>>>>>>>>> FLIP-9 provides a description of the triggers Flink already > >>>>> offers, > >>>>>>>>>>>>> the new that we think should be added, how the APIs could > look > >>>>> like, > >>>>>>>>>>>>> some discussion on the implementation implications and some > >> ideas > >>>>>>>>>>>>> on how to implement them. > >>>>>>>>>>>>> > >>>>>>>>>>>>> There is also a shared document giving a bit more insight on > >> the > >>>>>>>>>>> implementation > >>>>>>>>>>>>> implications. Feel free to read but please keep the > discussion > >>>>> in the > >>>>>>>>>>> mailing list. > >>>>>>>>>>>>> > >>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/ > >>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing < > >>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/ > >>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing> > >>>>>>>>>>>>> > >>>>>>>>>>>>> I would like to start working on an the implementation next > >> week. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Let the discussion begin! > >>>>>>>>>>>>> > >>>>>>>>>>>>> Kostas > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>> > >>>>> > >>>>> > >> > >> > >