Hi Manas, First of all I think your understanding of how the session windows work is correct.
I tend to slightly disagree that the end for a session window is wrong. It is my personal opinion though. I see it this way that a TimeWindow in case of a session window is the session itself. The session always ends after a period of inactivity. Take a user session on a webpage. Such a session does not end/isn't brought down at the time of a last event. It is closed after a period of inactivity. In such scenario I think the behavior of the session window is correct. Moreover you can achieve what you are describing with an aggregate[1] function. You can easily maintain the biggest number seen for a window. Lastly, I think the overall feeling in the community is that we are very skeptical towards extending the Windows API. From what I've heard and experienced the ProcessFunction[2] is a much better principle to build custom solutions upon, as in fact its easier to control and even understand. That said I am rather against introducing that change. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#aggregatefunction [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#process-function-low-level-operations On 13/03/2020 09:46, Manas Kale wrote: > Hi all, > I would like to start a discussion on this feature request (JIRA link). > <https://issues.apache.org/jira/browse/FLINK-16039> > > Consider the events : > > [1, event], [2, event] > > where first element is event timestamp in seconds and second element is > event code/name. > > Also consider that an Event time session window with inactivityGap = 2 > seconds is acting on above stream. > > When the first event arrives, a session window should be created that is > [1,1]. > > When the second event arrives, a new session window should be created that > is [2,2]. Since this falls within firstWindowTimestamp+inactivityGap, it > should be merged into session window [1,2] and [2,2] should be deleted. > > > *This is my understanding of how session windows are created. Please > correct me if wrong.* > However, Flink does not follow such a definition of windows semantically. > If I call the getEnd() method of the TimeWindow() class, I get back > timestamp + inactivityGap. > > For the above example, after processing the first element, I would get 1 + > 2 = 3 seconds as the window "end". > > The actual window end should be the timestamp 1, which is the last event in > the session window. > > A solution would be to change the "end" definition of all windows, but I > suppose this would be breaking and would need some debate. > > Therefore, I propose an intermediate solution : add a new API method that > keeps track of the last element added in the session window. > > If there is agreement on this, I would like to start drafting a change > document and implement this. >
signature.asc
Description: OpenPGP digital signature