Re: [DISCUSS] Allowed Lateness in Flink

2016-07-28 Thread Kostas Kloudas
t; -Original Message----- >> From: Aljoscha Krettek [mailto:aljos...@apache.org] >> Sent: Thursday, July 28, 2016 3:47 PM >> To: dev@flink.apache.org >> Subject: Re: [DISCUSS] Allowed Lateness in Flink >> >> Another (maybe completely crazy) idea is to regard the tr

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-28 Thread Aljoscha Krettek
6 3:47 PM > To: dev@flink.apache.org > Subject: Re: [DISCUSS] Allowed Lateness in Flink > > Another (maybe completely crazy) idea is to regard the triggers really as > a DSL and use compiler techniques to derive a state machine that you use to > do the actual triggering. > > Wit

RE: [DISCUSS] Allowed Lateness in Flink

2016-07-28 Thread Radu Tudoran
good. Will you start a FLIP document for this? -Original Message- From: Aljoscha Krettek [mailto:aljos...@apache.org] Sent: Thursday, July 28, 2016 3:47 PM To: dev@flink.apache.org Subject: Re: [DISCUSS] Allowed Lateness in Flink Another (maybe completely crazy) idea is to regard the

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-28 Thread Aljoscha Krettek
Another (maybe completely crazy) idea is to regard the triggers really as a DSL and use compiler techniques to derive a state machine that you use to do the actual triggering. With this, the "trigger" objects that make up the tree of triggers would not contain any logic themselves. A trigger speci

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-26 Thread Kostas Kloudas
And also I think that the shouldFire has to take as an additional argument the time. This will help differentiate between ON_TIME and EARLY, LATE firings. > On Jul 26, 2016, at 11:02 AM, Kostas Kloudas > wrote: > > Hello, > > This is a nice proposal that I think covers most of the cases. > The

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-26 Thread Kostas Kloudas
Hello, This is a nice proposal that I think covers most of the cases. The only thing that is missing would be a method: void onFire(Window window, TriggerContext ctx) that will be responsible for doing whatever is necessary if the windowOperator decides to fire. You can imagine resetting the c

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-25 Thread Aljoscha Krettek
Hi, yes, this is essentially the solution I had in my head but I went a bit further and generalized it. Basically, to make triggers composable they should have this interface, let's call it SimpleTrigger for now: class SimpleTrigger { void onElement(T element, long timestamp, W window, TriggerC

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-25 Thread Kostas Kloudas
Hi Aljoscha, This was exactly one of the problems I also found. The way I was thinking about it is the following: Conceptually, time (event and processing) advances but state is a fixed property of the window. Given this, I modified the Count trigger to also ask for the current state (count)

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-25 Thread Aljoscha Krettek
These are some very interesting thoughts! I have some more, based on these: What happens if you have for example this Trigger: All(Watermark.pastEndOfWindow(), Count.atLeast(10)) When would this even fire, i.e. what are the steps that lead to this combined trigger firing with the Trigger system t

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-22 Thread Kostas Kloudas
Forgot to say that the signature for the onFire() that I think fits should be: void onFire(Window window, TriggerContext ctx) throws Exception; > On Jul 22, 2016, at 12:47 PM, Kostas Kloudas > wrote: > > Hi, > > I started working on the new triggers proposed here and so far I can see > two s

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-22 Thread Kostas Kloudas
Hi, I started working on the new triggers proposed here and so far I can see two shortcomings in the current state of the triggers that do not play well with the new proposals, and more specifically the composite triggers All and Any. So here it goes: 1) In the document posted above, there a

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-21 Thread Aljoscha Krettek
I'm proposing to get this small change into 1.1: https://issues.apache.org/jira/browse/FLINK-4239 This will make our lives easier with the future proposed changes. What do you think? On Tue, 19 Jul 2016 at 11:41 Aljoscha Krettek wrote: > Hi, > these new features should make it into the 1.2 relea

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-19 Thread Aljoscha Krettek
Hi, these new features should make it into the 1.2 release. We are already working on releasing 1.1 so it won't make it for that one. unfortunately. Cheers, Aljoscha On Mon, 18 Jul 2016 at 23:19 Chen Qin wrote: > BTW, do you have rough timeline in term of roll out it to production? > > Thanks,

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Chen Qin
BTW, do you have rough timeline in term of roll out it to production? Thanks, Chen On Mon, Jul 18, 2016 at 2:46 AM, Aljoscha Krettek wrote: > Hi, > Chen commented this on the doc (I'm mirroring here so everyone can follow): > "It would be cool to be able to access last snapshot of window state

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Chen Qin
Aljoscha, Yes, that would works for our case! Chen On Mon, Jul 18, 2016 at 2:46 AM, Aljoscha Krettek wrote: > Hi, > Chen commented this on the doc (I'm mirroring here so everyone can follow): > "It would be cool to be able to access last snapshot of window states > before it get purged. Pipel

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-18 Thread Aljoscha Krettek
Hi, Chen commented this on the doc (I'm mirroring here so everyone can follow): "It would be cool to be able to access last snapshot of window states before it get purged. Pipeline author might consider put it to external storage and deal with late arriving events by restore corresponding window."

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-10 Thread Chen Qin
Sure. Currently, it looks like any element assigned to a too late window will be dropped silently😓 ? Having a late window stream imply somehow Flink needs to add one more state to window and split window state cleanup from window retirement. I would suggest as simple as adding a function in trigge

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-08 Thread Aljoscha Krettek
@Chen I added a section at the end of the document regarding access to the elements that are dropped as late. Right now, the section just mentions that we have to do this but there is no real proposal yet for how to do it. Only a rough sketch so that we don't forget about it. On Fri, 8 Jul 2016 at

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-07 Thread Chen Qin
+1 for allowedLateness scenario. The rationale behind is there are backfills or data issues hold in-window data till watermark pass end time. It cause sink write partial output. Allow high allowedLateness threshold makes life easier to merge those results and overwrite partial output with correct

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-07 Thread Kostas Kloudas
Hi, In the effort to move the discussion to the mailing list, rather than the doc, there was a comment in the doc: “It seems this proposal marries the allowed lateness of events and the discarding of window state. In most use cases this should be sufficient, but there are instances where havin

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Aljoscha Krettek
@Vishnu Funny you should ask that because I have a design doc lying around. I'll open a new mail thread to not hijack this one. On Wed, 6 Jul 2016 at 17:17 Vishnu Viswanath wrote: > Hi, > > I was going through the suggested improvements in window, and I have > few questions/suggestion on improve

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Vishnu Viswanath
Hi, I was going through the suggested improvements in window, and I have few questions/suggestion on improvement regarding the Evictor. 1) I am having a use case where I have to create a custom Evictor that will evict elements from the window based on the value (e.g., if I have elements are of ca

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Aljoscha Krettek
I did: https://mail-archives.apache.org/mod_mbox/flink-dev/201606.mbox/%3ccanmxww0abttjjg9ewdxrugxkjm7jscbenmvrzohpt2qo3pq...@mail.gmail.com%3e ;-) On Wed, 6 Jul 2016 at 15:31 Ufuk Celebi wrote: > On Wed, Jul 6, 2016 at 3:19 PM, Aljoscha Krettek > wrote: > > In the future, it might be good to

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Ufuk Celebi
On Wed, Jul 6, 2016 at 3:19 PM, Aljoscha Krettek wrote: > In the future, it might be good to to discussions directly on the ML and > then change the document accordingly. This way everyone can follow the > discussion on the ML. I also feel that Google Doc comments often don't give > enough space f

Re: [DISCUSS] Allowed Lateness in Flink

2016-07-06 Thread Aljoscha Krettek
Hi, I cleaned up the document a bit and added sections to address comments on the doc: https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit?usp=sharing (I also marked proposed features that are already implemented as [done].) The main thing that remains to be figure

Re: [DISCUSS] Allowed Lateness in Flink

2016-05-30 Thread Aljoscha Krettek
Thanks for the feedback! :-) I already read the comments on the file. On Mon, 30 May 2016 at 11:10 Gyula Fóra wrote: > Thanks Aljoscha :) I added some comments that might seem relevant from the > users point of view. > > Gyula > > Aljoscha Krettek ezt írta (időpont: 2016. máj. 30., > H, 10:33):

Re: [DISCUSS] Allowed Lateness in Flink

2016-05-30 Thread Gyula Fóra
Thanks Aljoscha :) I added some comments that might seem relevant from the users point of view. Gyula Aljoscha Krettek ezt írta (időpont: 2016. máj. 30., H, 10:33): > Hi, > I created a new doc specifically about the interplay of lateness and > window state garbage collection: > https://docs.goo

Re: [DISCUSS] Allowed Lateness in Flink

2016-05-30 Thread Aljoscha Krettek
Hi, I created a new doc specifically about the interplay of lateness and window state garbage collection: https://docs.google.com/document/d/1vgukdDiUco0KX4f7tlDJgHWaRVIU-KorItWgnBapq_8/edit?usp=sharing There is still some stuff that needs to be figured out, both in the new doc and the existing do

Re: [DISCUSS] Allowed Lateness in Flink

2016-04-26 Thread Aljoscha Krettek
Hi Max, thanks for the Feedback and suggestions! I'll try and address each paragraph separately. I'm afraid deciding based on the "StreamTimeCharacteristic is not possible since a user can use processing-time windows in their job even though the set the characteristic to event-time. Enabling event

Re: [DISCUSS] Allowed Lateness in Flink

2016-04-26 Thread Maximilian Michels
Hi Aljoscha, Thank you for the detailed design document. Wouldn't it be ok to allow these new concepts regardless of the time semantics? For Event Time and Ingestion Time "Lateness" and "Accumulating/Discarding" make sense. If the user chooses Processing time then these can be ignored during tran

Re: [DISCUSS] Allowed Lateness in Flink

2016-04-05 Thread Aljoscha Krettek
By the way. The way I see to fixing this is extending WindowAssigner with an "isEventTime()" method and then allow accumulating/lateness in the WindowOperator only if this is true. But it seems a but hacky because it special cases event-time. But then again, maybe we need to special case it ... O

[DISCUSS] Allowed Lateness in Flink

2016-04-05 Thread Aljoscha Krettek
Hi Folks, as part of my effort to improve the windowing in Flink [1] I also thought about lateness, accumulating/discarding and window cleanup. I have some ideas on this but I would love to get feedback from the community as I think that these things are important for everyone doing event-time wind