Your proposal could probably also be implemented by using Flink's support for allowed lateness when defining a window [1]. It has basically the same idea that there might be some elements which violate the watermark semantics and which need to be handled separately.
[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html#late-elements Cheers, Till On Thu, Apr 25, 2019 at 10:21 AM Congxian Qiu <qcx978132...@gmail.com> wrote: > There was someone working in IoT asking me whether Flink supports per-key > watermark also. > > I’m not sure if we can do the statistics by using raw state manipulating. > We create a single state for every single key, and when receiving a key, we > extract the timestamp and to see if we need to send some result to the > downside(like the trigger action in window), and we can also have tolerant > the come delay data. > > > Best, Congxian > On Apr 25, 2019, 01:58 +0800, Lasse Nedergaard <lassenederga...@gmail.com>, > wrote: > > Thanks Till > > What about this workaround. > If I after the watermark assignment split the stream in elements that fits > in the watermark (s1) and those that don’t (s2). The s1 I process with the > table api with a window aggregate using watermark and s2 I handle with an > unbounded non-windows aggregate with IdleStateRentionTime so state are > removed when my devices are up to date again. I then merge the two outputs > and continue. > By doing this I handle 99% as standard and only keeping state for the late > data. > > Make sense? And would it work? > > Med venlig hilsen / Best regards > Lasse Nedergaard > > > Den 24. apr. 2019 kl. 19.00 skrev Till Rohrmann <trohrm...@apache.org>: > > Hi Lasse, > > at the moment this is not supported out of the box by Flink. The community > thought about this feature but so far did not implement it. Unfortunately, > I'm also not aware of an easy workaround one could do in the user code > space. > > Cheers, > Till > > On Wed, Apr 24, 2019 at 3:26 PM Lasse Nedergaard < > lassenederga...@gmail.com> wrote: > >> Hi. >> >> We work with IoT data and we have cases where the IoT-device delay data >> transfer if it can't get network access. We would like to use table windows >> aggregate function over each device to calculate some statistics, but for >> windows aggregate functions to work we need to assign a watermark. This >> watermark is general for all devices. We can set allow latency, but we >> can't set it to months. >> So what we need is to have a watermark for each device (key by) so the >> window aggregate work on the timestamp delivered for the device and not the >> global watermark. >> Is that possible, or have anyone consider this feature? >> >> Best >> >> Lasse Nedergaard >> >> >