Hi Dawid,
Thank you for the response, I see your point. I was perhaps thinking only
from the perspective of my use case where I think such a definition makes
sense and did not account for the general case.

Regards,
Manas


On Thu, Mar 26, 2020 at 8:40 PM Dawid Wysakowicz <dwysakow...@apache.org>
wrote:

> Hi Manas,
>
> First of all I think your understanding of how the session windows work
> is correct.
>
> I tend to slightly disagree that the end for a session window is wrong.
> It is my personal opinion though. I see it this way that a TimeWindow in
> case of a session window is the session itself. The session always ends
> after a period of inactivity. Take a user session on a webpage. Such a
> session does not end/isn't brought down at the time of a last event. It
> is closed after a period of inactivity. In such scenario I think the
> behavior of the session window is correct.
>
> Moreover you can achieve what you are describing with an aggregate[1]
> function. You can easily maintain the biggest number seen for a window.
>
> Lastly, I think the overall feeling in the community is that we are very
> skeptical towards extending the Windows API. From what I've heard and
> experienced the ProcessFunction[2] is a much better principle to build
> custom solutions upon, as in fact its easier to control and even
> understand. That said I am rather against introducing that change.
>
> Best,
>
> Dawid
>
> [1]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#aggregatefunction
>
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#process-function-low-level-operations
>
> On 13/03/2020 09:46, Manas Kale wrote:
> > Hi all,
> > I would like to start a discussion on this feature request (JIRA link).
> > <https://issues.apache.org/jira/browse/FLINK-16039>
> >
> > Consider the events :
> >
> > [1, event], [2, event]
> >
> > where first element is event timestamp in seconds and second element is
> > event code/name.
> >
> > Also consider that an Event time session window with inactivityGap = 2
> > seconds is acting on above stream.
> >
> > When the first event arrives, a session window should be created that is
> > [1,1].
> >
> > When the second event arrives, a new session window should be created
> that
> > is [2,2]. Since this falls within firstWindowTimestamp+inactivityGap, it
> > should be merged into session window [1,2] and  [2,2] should be deleted.
> >
> >
> > *This is my understanding of how session windows are created. Please
> > correct me if wrong.*
> > However, Flink does not follow such a definition of windows semantically.
> > If I call the  getEnd() method of the TimeWindow() class, I get back
> > timestamp + inactivityGap.
> >
> > For the above example, after processing the first element, I would get 1
> +
> > 2 = 3 seconds as the window "end".
> >
> > The actual window end should be the timestamp 1, which is the last event
> in
> > the session window.
> >
> > A solution would be to change the "end" definition of all windows, but I
> > suppose this would be breaking and would need some debate.
> >
> > Therefore, I propose an intermediate solution : add a new API method that
> > keeps track of the last element added in the session window.
> >
> > If there is agreement on this, I would like to start drafting a change
> > document and implement this.
> >
>
>

Reply via email to