Hello Users,

I have a question that is perhaps not best solved within Flink: it has to do with notifying a downstream application that a Flink window has completed.

The (simplified) scenario is this:

- We have a Flink job that consumes from Kafka, does some preprocessing, and then applies a sliding window of 10 minutes with a slide of 1 minute.
- The number of keys in each slide is not fixed.
- The output of the window is then written to Kafka, where it is read by a downstream application.

What I want to achieve is that the downstream application can somehow know when it has read all of the data for a single window, without waiting for the next window to arrive.

Some options I've considered:

- Producing a second window over the window results that counts the output size, which the downstream application can then use to see when it has received the same number of records. This seems fragile, as it relies on there being no loss or duplication of data. It's also an extra window and Kafka stream, which is a tad messy.
- Somehow adding an 'end of window' element to each partition of the Kafka topic, which can be read by the consumer. This seems a bit messy because it mixes different types of events into the same Kafka stream, and there is no really simple way to do this in Flink.
- Packaging the whole window output into a single message and making this the unit of transaction. This is possible, but the message would then be quite large (at least 10s of MB), as the volume of this stream is quite large.
- Assuming that if the consumer has no elements to read, or if the next window has started to be read, then it has read the whole window. This seems reasonable, and if it weren't for the fact that my consumer on the application end is a bit inflexible right now, it is probably the solution I would use.

Any further/better ideas?

Thanks
Padarn
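
P.S. To make the first option (a second stream of per-window counts) a bit more concrete, here is a rough sketch of what the consumer side might look like. It is plain Java with no Flink or Kafka dependencies, and everything in it is an assumption for illustration: the class name WindowCompletionTracker, the method names, and the idea that each window is identified by its end timestamp in milliseconds. It also shows why the approach is fragile: a lost record means the window never completes, and a duplicated record can make it look complete too early.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical consumer-side helper; none of these names come from Flink or Kafka.
final class WindowCompletionTracker {
    // Number of window results read so far, keyed by window end timestamp (ms).
    private final Map<Long, Long> received = new HashMap<>();
    // Totals announced on the count stream, keyed by window end timestamp (ms).
    private final Map<Long, Long> expected = new HashMap<>();

    /** Call for every window result read; returns true if that window is now complete. */
    boolean onRecord(long windowEnd) {
        received.merge(windowEnd, 1L, Long::sum);
        return isComplete(windowEnd);
    }

    /** Call when the count message for a window arrives; returns true if that window is now complete. */
    boolean onCount(long windowEnd, long total) {
        expected.put(windowEnd, total);
        return isComplete(windowEnd);
    }

    // Complete only once the total is known and exactly that many results were read.
    private boolean isComplete(long windowEnd) {
        Long total = expected.get(windowEnd);
        if (total == null) {
            return false; // count message not seen yet
        }
        long seen = received.getOrDefault(windowEnd, 0L);
        return seen == total;
    }
}
```

The count message and the results can arrive in either order, so both methods run the same check. A real consumer would also need to evict old windows from the two maps, which I have left out here.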