Hi, Marc

I think the window operator might fulfill your needs. You could find the
detailed description here[1]
In general, I think you could choose the correct type of window and use the
`ProcessWindowFunction` to emit the elements that match the best sum.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html
<https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/windows.html#using-per-window-state-in-processwindowfunction>
Best,
Guowei


ba <marc.de...@baranalytics.co.uk> 于2020年5月20日周三 下午9:58写道:

> Hi All,
>
> I'm new to Flink but am trying to write an application that processes data
> from internet connected sensors.
>
> My problem is as follows:
>
> -Data arrives in the format: [sensor id] [timestamp in seconds] [sensor
> value]
> -Data can arrive out of order (between sensor IDs) by upto 5 minutes.
> -So a stream of data could be:
> [1] [100] [20]
> [2] [101] [23]
> [1] [105] [31]
> [1] [140] [17]
>
> -Each sensor can sometimes split its measurements, and I'm hoping to 'put
> them back together' within Flink. For data to be 'put back together' it
> must
> have a timestamp within 90 seconds of the timestamp on the first piece of
> data. The data must also be put back together in order, in the example
> above
> for sensor 1 you could only have combinations of (A) the first reading on
> its own (time 100), (B) the first and third item (time 100 and 105) or (C)
> the first, third and fourth item (time 100, 105, 140). The second item is a
> different sensor so not considered in this exercise.
>
> -I would like to write something that tries different 'sum' combinations
> within the 90 second limit and outputs the best 'match' to expected values.
> In the example above lets say the expected sum values are 50 or 100. Of the
> three combinations I mentioned for sensor 1, the sum would be 20, 51, or
> 68.
> Therefore the 'best' combination is 51 as it is closest to 50 or 100, so it
> would output two data items: [1] [100] [20] and [1] [105] [31], with the
> last item left in the stream and matched with any other data points that
> arrive after.
>
> I am thinking some sort of iterative function that does this, but am not
> sure how to emit the values I want and keep other values that were
> considered (but not used) in the stream.
>
> Any ideas or help is really appreciated?
>
> Thanks,
> Marc
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>

Reply via email to