Hi Manas,

First of all, I think your understanding of how session windows work is
correct.

I tend to slightly disagree that the end of a session window is wrong,
though this is only my personal opinion. The way I see it, the TimeWindow
of a session window represents the session itself, and a session always
ends after a period of inactivity. Take a user session on a web page:
such a session is not closed at the time of the last event, but only
after a period of inactivity. In that scenario I think the behavior of
the session window is correct.
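
To make the arithmetic concrete, here is a rough plain-Java sketch of how
I understand the assignment and merging to work for the two events from
your example (timestamps in milliseconds, gap = 2000). It does not depend
on Flink; SessionSketch and its method names are made up for illustration,
and the intersects/cover logic is my reading of what TimeWindow does, not
a copy of it.

```java
// Hypothetical helper, not part of Flink: windows are modeled as
// long[] {start, end}, with end = timestamp + gap as in the example.
final class SessionSketch {

    // Each event first gets its own window [timestamp, timestamp + gap).
    static long[] assign(long timestampMillis, long gapMillis) {
        return new long[] {timestampMillis, timestampMillis + gapMillis};
    }

    // Two windows belong to the same session if they touch or overlap.
    static boolean intersects(long[] a, long[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }

    // The merged session spans both windows.
    static long[] cover(long[] a, long[] b) {
        return new long[] {Math.min(a[0], b[0]), Math.max(a[1], b[1])};
    }
}
```

With the events at 1000 and 2000 this gives [1000, 3000) and [2000, 4000),
which merge into [1000, 4000). The end is the last event plus the gap,
i.e. the moment the session is considered closed.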

Moreover, you can achieve what you are describing with an aggregate[1]
function: you can easily maintain the largest timestamp seen in a window.
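
A minimal sketch of such an aggregate, assuming the goal is the timestamp
of the last event in the session. It is written in plain Java so it can be
read without Flink on the classpath; the four methods mirror those of
AggregateFunction<IN, ACC, OUT>. The Event class and the name
MaxTimestampAggregate are hypothetical.

```java
// Hypothetical event type: timestamp in milliseconds plus an event code.
final class Event {
    final long timestampMillis;
    final String code;

    Event(long timestampMillis, String code) {
        this.timestampMillis = timestampMillis;
        this.code = code;
    }
}

// Mirrors the method set of Flink's AggregateFunction<Event, Long, Long>.
// To use it in a job, declare "implements AggregateFunction<Event, Long, Long>"
// and pass an instance to aggregate() on the windowed stream.
final class MaxTimestampAggregate {

    // Start below any real timestamp so the first event always wins.
    public Long createAccumulator() {
        return Long.MIN_VALUE;
    }

    // Keep the largest timestamp seen so far.
    public Long add(Event value, Long accumulator) {
        return Math.max(accumulator, value.timestampMillis);
    }

    public Long getResult(Long accumulator) {
        return accumulator;
    }

    // Needed for session windows, where accumulators of merged windows
    // are combined.
    public Long merge(Long a, Long b) {
        return Math.max(a, b);
    }
}
```

The merge method matters here: when two session windows are merged, their
partial results have to be combined, and taking the maximum keeps the
timestamp of the latest event.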

Lastly, I think the overall feeling in the community is that we are very
skeptical about extending the Windows API. From what I've heard and
experienced, the ProcessFunction[2] is a much better foundation to build
custom solutions upon, as it is in fact easier to control and even to
understand. That said, I am rather against introducing this change.

Best,

Dawid

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#aggregatefunction

[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#process-function-low-level-operations

On 13/03/2020 09:46, Manas Kale wrote:
> Hi all,
> I would like to start a discussion on this feature request (JIRA link).
> <https://issues.apache.org/jira/browse/FLINK-16039>
>
> Consider the events :
>
> [1, event], [2, event]
>
> where first element is event timestamp in seconds and second element is
> event code/name.
>
> Also consider that an Event time session window with inactivityGap = 2
> seconds is acting on above stream.
>
> When the first event arrives, a session window should be created that is
> [1,1].
>
> When the second event arrives, a new session window should be created that
> is [2,2]. Since this falls within firstWindowTimestamp+inactivityGap, it
> should be merged into session window [1,2] and [2,2] should be deleted.
>
>
> *This is my understanding of how session windows are created. Please
> correct me if wrong.*
> However, Flink does not follow such a definition of windows semantically.
> If I call the getEnd() method of the TimeWindow class, I get back
> timestamp + inactivityGap.
>
> For the above example, after processing the first element, I would get 1 +
> 2 = 3 seconds as the window "end".
>
> The actual window end should be the timestamp 1, which is the last event in
> the session window.
>
> A solution would be to change the "end" definition of all windows, but I
> suppose this would be breaking and would need some debate.
>
> Therefore, I propose an intermediate solution : add a new API method that
> keeps track of the last element added in the session window.
>
> If there is agreement on this, I would like to start drafting a change
> document and implement this.
>
