Hi Dominik,

Was the job running with processing time or event time? If event time, how are you producing the watermarks? These two factors are normally the first place to look when trying to understand how windows fire in Flink. I can explain further once you provide this information.

Also, are you using Kafka 0.10?

Cheers,
Gordon

On March 27, 2017 at 11:25:49 PM, Dominik Safaric (dominiksafa...@gmail.com) wrote:

Hi all,

Lately I’ve been investigating the performance characteristics of Flink as part of our internal benchmark. For this, we’ve developed and deployed an application that polls data from Kafka and groups the data by key over a fixed time window of one minute. In total, the topic that the KafkaConsumer polls from consists of 100 million messages, each 100 bytes in size.

We were expecting that no records would be read from or produced back to Kafka during the first minute of the window operation. Unfortunately, this is not the case. Below you may find a plot showing the number of records produced per second.

Could anyone explain the behaviour shown in the graph below? Why are messages being consumed from and produced to Kafka while the window has not yet expired?
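
As background to the question above, here is a minimal sketch in plain Java (no Flink dependency) of one thing that can make a one-minute window fire earlier than expected. Flink's tumbling windows are aligned to the epoch rather than to the moment the job starts, so the first window usually closes less than a full window length after startup. The source also reads from Kafka continuously while the window operator buffers records, so consumption is not paused until a window ends. The class name, method names, and the job start time below are hypothetical and only illustrate the arithmetic; the thread does not say whether processing time or event time was used, so this assumes processing time with no window offset.

```java
public class TumblingWindowSketch {

    // Start of the epoch-aligned tumbling window that contains timestampMs.
    // Assumes a non-negative timestamp and no window offset.
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Milliseconds from jobStartMs until the window containing it ends,
    // which is when the first result for that window can be emitted.
    public static long millisUntilFirstFiring(long jobStartMs, long windowSizeMs) {
        long windowEnd = windowStart(jobStartMs, windowSizeMs) + windowSizeMs;
        return windowEnd - jobStartMs;
    }

    public static void main(String[] args) {
        long windowSizeMs = 60_000L;   // one-minute tumbling window
        long jobStartMs = 645_000L;    // hypothetical start, 45 s into a minute

        System.out.println("window start (ms): "
                + windowStart(jobStartMs, windowSizeMs));
        System.out.println("first firing after (ms): "
                + millisUntilFirstFiring(jobStartMs, windowSizeMs));
    }
}
```

With event time the effect can be stronger: watermarks follow the timestamps of the records rather than the wall clock, so replaying a backlog of older messages can close many one-minute windows within seconds of wall-clock time.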