Hi Marc,

I think the window operator might fulfill your needs. You can find a detailed description here [1]. In general, I think you could choose an appropriate type of window and use a `ProcessWindowFunction` to emit the elements whose sum is the best match for the expected values.
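To make the selection step a bit more concrete, here is a minimal sketch in plain Java (no Flink dependencies) of the logic that could run inside the `process` method of a `ProcessWindowFunction`, once the readings of one sensor have been sorted by timestamp. The class `BestPrefixMatcher`, the `Reading` type and the inclusive 90-second comparison are assumptions made for illustration; they are not Flink APIs, and you may need to adapt them to your data.

```java
import java.util.List;

/** Hypothetical helper (not part of Flink): picks the best prefix of one sensor's readings. */
final class BestPrefixMatcher {

    /** One sensor reading: timestamp in seconds and the measured value. */
    static final class Reading {
        final long timestamp;
        final int value;

        Reading(long timestamp, int value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Returns how many leading readings should be combined so that their sum is
     * closest to any of the expected sums. Only readings whose timestamp is at
     * most maxGapSeconds after the first reading are considered (assumed inclusive).
     * The list must already be sorted by timestamp and belong to a single sensor.
     */
    static int bestPrefixLength(List<Reading> sorted, long maxGapSeconds, int[] expectedSums) {
        if (sorted.isEmpty()) {
            return 0;
        }
        long first = sorted.get(0).timestamp;
        int bestLength = 0;
        long bestDistance = Long.MAX_VALUE;
        long sum = 0;
        for (int i = 0; i < sorted.size(); i++) {
            Reading r = sorted.get(i);
            if (r.timestamp - first > maxGapSeconds) {
                break; // too far from the first reading, so it stays for a later match
            }
            sum += r.value;
            for (int expected : expectedSums) {
                long distance = Math.abs(sum - expected);
                if (distance < bestDistance) { // on ties, the shorter prefix wins
                    bestDistance = distance;
                    bestLength = i + 1;
                }
            }
        }
        return bestLength;
    }
}
```

With the sample data for sensor 1 (values 20, 31 and 17 at times 100, 105 and 140) and expected sums 50 and 100, the candidate sums are 20, 51 and 68, so `bestPrefixLength` returns 2 and the first two readings would be emitted. How the remaining readings are carried over to later matches (for example, in keyed state) is left open in this sketch.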
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#using-per-window-state-in-processwindowfunction

Best,
Guowei

ba <marc.de...@baranalytics.co.uk> wrote on Wed, May 20, 2020 at 9:58 PM:

> Hi All,
>
> I'm new to Flink but am trying to write an application that processes data
> from internet connected sensors.
>
> My problem is as follows:
>
> -Data arrives in the format: [sensor id] [timestamp in seconds] [sensor
> value]
> -Data can arrive out of order (between sensor IDs) by up to 5 minutes.
> -So a stream of data could be:
> [1] [100] [20]
> [2] [101] [23]
> [1] [105] [31]
> [1] [140] [17]
>
> -Each sensor can sometimes split its measurements, and I'm hoping to 'put
> them back together' within Flink. For data to be 'put back together' it
> must have a timestamp within 90 seconds of the timestamp on the first
> piece of data. The data must also be put back together in order. In the
> example above, for sensor 1 you could only have combinations of (A) the
> first reading on its own (time 100), (B) the first and third item (time
> 100 and 105) or (C) the first, third and fourth item (time 100, 105, 140).
> The second item is a different sensor so not considered in this exercise.
>
> -I would like to write something that tries different 'sum' combinations
> within the 90 second limit and outputs the best 'match' to expected
> values. In the example above let's say the expected sum values are 50 or
> 100. Of the three combinations I mentioned for sensor 1, the sum would be
> 20, 51, or 68. Therefore the 'best' combination is 51 as it is closest to
> 50 or 100, so it would output two data items: [1] [100] [20] and
> [1] [105] [31], with the last item left in the stream and matched with any
> other data points that arrive after.
>
> I am thinking some sort of iterative function that does this, but am not
> sure how to emit the values I want and keep other values that were
> considered (but not used) in the stream.
>
> Any ideas or help is really appreciated?
>
> Thanks,
> Marc
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/