Hello, I have two data streams, and want to join them using a tumbling window. Each of the streams would have at most one record per window. There is also a requirement to log/save the records that don't have a companion from the other stream. What would be the best option for my case? Would that be possible to use Flink's Join?
I tried to use CoProcessFunction: truncating the timestamp of each record to the beginning of the tumbling window, and then "keyBy" the two streams using (key, truncated-timestamp). When I receive a record from one stream, if that's the first record of the pair, then I save it to a MapState. If it is the 2nd record, then I merge with the 1st one then fire. This implementation works, but (a) I feel that it's over-complicated, and (b) I have a concern that when one stream is slower than the other, my cached data would build up and make my cluster out-of-memory. Would back-pressure kicks in for this case? Thanks and best regards, Averell -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/