Re: [DISCUSS] FLIP-9: Trigger DSL

Aljoscha Krettek Wed, 17 Aug 2016 08:32:21 -0700

Hi,
I opened this Jira issue, which should help in implementing the Trigger DSL
but is also independent of it, in that it just extends the range of things
that can be done with a Trigger:
https://issues.apache.org/jira/browse/FLINK-4415


Cheers,
Aljoscha

On Wed, 17 Aug 2016 at 14:38 Jark Wu <wuchong...@alibaba-inc.com> wrote:

> Hi Aljoscha, Kostas, thanks for your detailed explanation. It makes sense.
>
> Regarding discarding and accumulating, the FLIP says “the mode of the
> parent trigger overwrites that of its children”. That means the Trigger
> decides whether to discard the window contents after firing, right? But the
> original Google doc [1] proposed that the Trigger only decides whether to
> fire, while the purging behavior is determined by a setting on the
> WindowedStream, such as:
>
> datastream.keyBy(0)
>           .window(windowAssigner)
>           .trigger(compositeTrigger)
>           .accumulating()
>
>
> [1]
> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u
>
> - Jark Wu
>
> > On Aug 17, 2016, at 6:12 PM, Aljoscha Krettek <aljos...@apache.org> wrote:
> >
> > Hi,
> > I think that would blow up state since there can be several triggers that
> > need this kind of state; Any and All come to mind, possibly. If each of
> > those keeps state, that's at least a byte per trigger. If the finished
> > state were kept centrally by the TriggerRunner, it would be just one byte
> > for everything, in most cases.
> >
> > As I said, in some cases keeping that extra bit can be avoided. For
> > example, if you have Repeat.forever(Some.trigger()) you know that the
> > finished bit will always be false and so you don't keep any state in the
> > TriggerRunner. If every trigger manually does that bookkeeping you remove
> > that possibility while increasing complexity in each Trigger
> > implementation.
> >
> > Cheers,
> > Aljoscha
> >
> > On Wed, 17 Aug 2016 at 12:05 Kostas Kloudas <k.klou...@data-artisans.com>
> > wrote:
> >
> >> Hi Aljoscha,
> >>
> >> On the Repeat.? addition, I think that each trigger will have to have
> >> its own implementation, e.g. the CountTrigger should just set a dummy
> >> value in the counter in order to know if it should fire again or not.
> >>
> >> Otherwise, we will have to add more state, and this can lead to
> >> significant performance degradation, as in most cases this state has to
> >> be checked on every element.
> >>
> >> Another potential solution, which I am not sure covers all cases, could
> >> be to have a State abstraction like CompositeState, in addition to
> >> Value, List, Reduce, and Fold, which can fetch more than one type of
> >> state with one round trip to the backend. Imagine having the “counter”
> >> and the “canceled” states in the same entry in the backend and always
> >> fetching them together. This can lead to zero additional cost for the
> >> extra state.
> >>
> >> What do you think?
> >>
> >> Kostas
> >>
> >>> On Aug 17, 2016, at 11:57 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
> >>>
> >>> Regarding Repeat.forever() and the default being to not repeat: the
> >>> simple reason is that Beam (née Google Dataflow) provides basically the
> >>> same thing with their trigger DSL and that their triggers behave like
> >>> this. I think it would not be beneficial to have the same feature in two
> >>> systems in that space where the behavior is the opposite. That would
> >>> make it confusing for users.
> >>>
> >>> On the implementation side, I think in most cases you need to have a way
> >>> of telling when triggers are finished or not anyway. There could be a
> >>> central component in the TriggerRunner that has a finished bit for every
> >>> trigger in the tree. In most cases this would be a simple byte. Triggers
> >>> could set and query this finished bit. In some cases, where you know
> >>> that triggers can never finish, you could have a dummy implementation of
> >>> the finished set that does not store any state and always returns false
> >>> when queried.
> >>>
> >>> On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <aljos...@apache.org> wrote:
> >>>
> >>>> Kostas already nicely explained this!
> >>>>
> >>>> I just want to give some theoretical background. I see the underlying
> >>>> idea of triggers as similar to predicates, i.e.
> >>>> "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)"
> >>>> translates to a predicate "(E and ET) or WT" (where E is a predicate
> >>>> that is true when we are in the early phase, ET is the early trigger and
> >>>> WT is the watermark trigger). The other trigger translates to
> >>>> "(!E and LT) or WT" (LT being the late trigger), i.e. it triggers if
> >>>> we're not early and LT is true or if the watermark trigger is true. If
> >>>> we combine the two we get:
> >>>>
> >>>> ((E and ET) or WT) and ((!E and LT) or WT)
> >>>>
> >>>> Now we can eliminate the two parts with E and !E: they can never be
> >>>> true at the same time, so whenever both sides hold it is because of
> >>>> the "or WT" on each side:
> >>>>
> >>>> WT and WT
> >>>>
> >>>> which yields just "WT".
> >>>>
> >>>> Hope that makes sense to you.
> >>>>
> >>>> Cheers,
> >>>> Aljoscha
> >>>>
> >>>>
> >>>> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas <k.klou...@data-artisans.com>
> >>>> wrote:
> >>>>
> >>>>> Hello Jark Wu,
> >>>>>
> >>>>> Both of them will work in the new DSL. The idea is that there should
> >>>>> be no restrictions on the combinations one can do.
> >>>>>
> >>>>> As for what the early and late triggers do: the early trigger is
> >>>>> responsible for specifying when the trigger should fire in the period
> >>>>> between the beginning of the window and the time when the watermark
> >>>>> passes the end of the window. The late trigger takes over after the
> >>>>> watermark passes the end of the window, and specifies when the trigger
> >>>>> should fire in the period between endOfWindow and
> >>>>> endOfWindow + allowedLateness.
> >>>>>
> >>>>> So in the case of:
> >>>>>     All(EventTimeTrigger.afterEndOfWindow()
> >>>>>             .withEarlyTrigger(earlyFiringTrigger),
> >>>>>         EventTimeTrigger.afterEndOfWindow()
> >>>>>             .withLateTrigger(lateFiringTrigger))
> >>>>>
> >>>>> The trigger will only fire at the end of the window, as this is the
> >>>>> only time both triggers will say FIRE.
> >>>>>
> >>>>> Although the above will work, the example that you gave is a nice one
> >>>>> as it degenerates to:
> >>>>>
> >>>>>       EventTimeTrigger.afterEndOfWindow()
> >>>>>
> >>>>> Detecting this and giving the simplest trigger for the job can lead
> >>>>> to further optimizations, as it can for example reduce the amount of
> >>>>> state the trigger has to keep.
> >>>>>
> >>>>> That would actually be a very nice addition to have, as in some cases
> >>>>> it can lead to performance improvements.
> >>>>>
> >>>>> Thanks for the feedback!
> >>>>>
> >>>>> Kostas
> >>>>>
> >>>>>> On Aug 17, 2016, at 4:36 AM, Jark Wu <wuchong...@alibaba-inc.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> It’s a cool design, I really like it! I have two questions.
> >>>>>>
> >>>>>> The first is whether complex composite triggers, i.e. nested All and
> >>>>>> Any, are supported. For example:
> >>>>>>
> >>>>>> Any(
> >>>>>>     All(trigger1, trigger2),
> >>>>>>     Any(trigger3, trigger4)
> >>>>>> )
> >>>>>>
> >>>>>> Can the above code work?
> >>>>>>
> >>>>>> Another question: in composite triggers, what is the behavior of
> >>>>>> withEarlyTrigger and withLateTrigger? For example,
> >>>>>>
> >>>>>> All(EventTimeTrigger.afterEndOfWindow()
> >>>>>>         .withEarlyTrigger(earlyFiringTrigger),
> >>>>>>     EventTimeTrigger.afterEndOfWindow()
> >>>>>>         .withLateTrigger(lateFiringTrigger))
> >>>>>>
> >>>>>> Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both
> >>>>>> work?
> >>>>>>
> >>>>>>
> >>>>>> - Jark Wu
> >>>>>>
> >>>>>>> On Aug 17, 2016, at 12:24 AM, Kostas Kloudas <k.klou...@data-artisans.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> Hi Aljoscha,
> >>>>>>>
> >>>>>>> Thanks for the feedback!
> >>>>>>>
> >>>>>>> It is a nice feature to have. The reason it is not included in the
> >>>>>>> FLIP is that I have not seen anybody asking for something similar on
> >>>>>>> the mailing list.
> >>>>>>>
> >>>>>>> A point that I have to add is that it seems (from the user ML) that
> >>>>>>> most of the time users expect the “Repeated.forever” behavior to
> >>>>>>> be the default.
> >>>>>>>
> >>>>>>> Given this, I would say that we should make this the default and
> >>>>>>> add something like a “Repeat.Once” option, which will just let the
> >>>>>>> trigger fire once, e.g. the first time the counter reaches 5 in your
> >>>>>>> example, and then stop.
> >>>>>>>
> >>>>>>> Otherwise, the trigger specification may become too verbose, as the
> >>>>>>> user will have to write “Repeat.forever” for all child triggers.
> >>>>>>>
> >>>>>>> What do you think?
> >>>>>>>
> >>>>>>> Kostas
> >>>>>>>
> >>>>>>>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek <aljos...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Ah, I just read the document again and noticed that it might be
> >>>>>>>> good to differentiate between repeatable triggers and non-repeating
> >>>>>>>> triggers. I'm proposing to make most triggers non-repeating with the
> >>>>>>>> addition of a trigger that makes other triggers repeatable.
> >>>>>>>>
> >>>>>>>> Example Non-Repeating:
> >>>>>>>> EventTimeTrigger.pastEndOfWindow()
> >>>>>>>>     .withEarlyFiring(CountTrigger.of(5))
> >>>>>>>>
> >>>>>>>> This gives me an early firing once I have 5 elements and then an
> >>>>>>>> on-time firing once the watermark passes the end of the window.
> >>>>>>>>
> >>>>>>>> Example with Repeating:
> >>>>>>>> EventTimeTrigger.pastEndOfWindow()
> >>>>>>>>     .withEarlyFiring(Repeated.forever(CountTrigger.of(5)))
> >>>>>>>>
> >>>>>>>> This gives me early firings whenever I see 5 new elements plus the
> >>>>>>>> watermark firing.
> >>>>>>>>
> >>>>>>>> What do you think?
> >>>>>>>>
> >>>>>>>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas <k.klou...@data-artisans.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Till!
> >>>>>>>>>
> >>>>>>>>> Kostas
> >>>>>>>>>
> >>>>>>>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann <trohrm...@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Cool design doc Klou. It's well described with a lot of details. I
> >>>>>>>>>> like it a lot :-) +1 for implementing the trigger DSL.
> >>>>>>>>>>
> >>>>>>>>>> Cheers,
> >>>>>>>>>> Till
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas
> >>>>>>>>>> <k.klou...@data-artisans.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks for the feedback Ufuk!
> >>>>>>>>>>> I will do that.
> >>>>>>>>>>>
> >>>>>>>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <u...@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes
> >>>>>>>>>>>> sense to merge the two documents by moving the Google doc
> >>>>>>>>>>>> contents to the Wiki. I think they form one unit.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
> >>>>>>>>>>>> <k.klou...@data-artisans.com> wrote:
> >>>>>>>>>>>>> Hi all!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've created a FLIP for the trigger DSL. These are the triggers
> >>>>>>>>>>>>> that we want Apache Flink to support out-of-the-box. This
> >>>>>>>>>>>>> proposal builds on various discussions on the mailing list and
> >>>>>>>>>>>>> aims at serving as a base for further ones.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> FLIP-9 provides a description of the triggers Flink already
> >>>>>>>>>>>>> offers, the new ones that we think should be added, what the
> >>>>>>>>>>>>> APIs could look like, some discussion of the implementation
> >>>>>>>>>>>>> implications, and some ideas on how to implement them.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> There is also a shared document giving a bit more insight into
> >>>>>>>>>>>>> the implementation implications. Feel free to read it, but
> >>>>>>>>>>>>> please keep the discussion on the mailing list.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I would like to start working on the implementation next
> >>>>>>>>>>>>> week.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Let the discussion begin!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Kostas
>
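
A minimal sketch, not part of the thread, of the predicate argument in the
11:52 message quoted above. It brute-forces all sixteen truth assignments and
checks that ((E and ET) or WT) and ((!E and LT) or WT) always equals WT. The
class and method names are made up for illustration; nothing here is Flink API.

```java
/** Brute-force check of the predicate simplification from the thread. */
final class TriggerPredicateCheck {

    // Combined predicate for All(early variant, late variant):
    // ((E and ET) or WT) and ((!E and LT) or WT)
    static boolean combined(boolean e, boolean et, boolean lt, boolean wt) {
        return ((e && et) || wt) && ((!e && lt) || wt);
    }

    // Returns true if the combined predicate equals WT for every assignment.
    static boolean equivalentToWatermarkTrigger() {
        boolean[] values = {false, true};
        for (boolean e : values) {
            for (boolean et : values) {
                for (boolean lt : values) {
                    for (boolean wt : values) {
                        if (combined(e, et, lt, wt) != wt) {
                            return false;
                        }
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("combined == WT for all inputs: "
                + equivalentToWatermarkTrigger());
    }
}
```

The check holds because the E and !E parts can never be true together, so only
the shared WT term can satisfy both sides.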

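A hedged sketch of the finished-bit idea discussed above: a central set holds
one finished bit per trigger, and a dummy variant stores nothing and always
answers false, which is what a Repeat.forever(...) tree would need.
FinishedSet, BitSetFinishedSet, NeverFinishedSet and this toy CountTrigger are
hypothetical names, neither the FLIP's design nor existing Flink classes, and
using the dummy set to model repetition is a simplification.

```java
import java.util.BitSet;

final class FinishedBitSketch {

    /** Hypothetical central "finished" bookkeeping, one bit per trigger. */
    interface FinishedSet {
        boolean isFinished(int triggerIndex);
        void setFinished(int triggerIndex, boolean finished);
    }

    /** Stores the finished bits in a single BitSet. */
    static final class BitSetFinishedSet implements FinishedSet {
        private final BitSet bits = new BitSet();
        public boolean isFinished(int triggerIndex) { return bits.get(triggerIndex); }
        public void setFinished(int triggerIndex, boolean finished) {
            bits.set(triggerIndex, finished);
        }
    }

    /** Dummy for trees that can never finish: no state, always false. */
    static final class NeverFinishedSet implements FinishedSet {
        public boolean isFinished(int triggerIndex) { return false; }
        public void setFinished(int triggerIndex, boolean finished) { /* ignored */ }
    }

    /** Toy count trigger: fires at the threshold, then marks itself finished. */
    static final class CountTrigger {
        private final int index;
        private final int threshold;
        private int count;

        CountTrigger(int index, int threshold) {
            this.index = index;
            this.threshold = threshold;
        }

        boolean onElement(FinishedSet finished) {
            if (finished.isFinished(index)) {
                return false; // a finished trigger never fires again
            }
            count++;
            if (count < threshold) {
                return false;
            }
            count = 0;
            finished.setFinished(index, true);
            return true;
        }
    }

    /** Counts firings; repeatForever selects the stateless dummy set. */
    static int countFirings(int elements, int threshold, boolean repeatForever) {
        FinishedSet finished =
                repeatForever ? new NeverFinishedSet() : new BitSetFinishedSet();
        CountTrigger trigger = new CountTrigger(0, threshold);
        int firings = 0;
        for (int i = 0; i < elements; i++) {
            if (trigger.onElement(finished)) {
                firings++;
            }
        }
        return firings;
    }

    public static void main(String[] args) {
        System.out.println("non-repeating: " + countFirings(12, 5, false));
        System.out.println("repeating:     " + countFirings(12, 5, true));
    }
}
```

With 12 elements and a threshold of 5, the non-repeating variant fires once
and the repeating variant fires twice.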