Hi Clive, The behaviour you are seeing is indeed correct (though not necessarily optimal in terms of performance as described in this JIRA: https://issues.apache.org/jira/browse/KAFKA-3101 <https://issues.apache.org/jira/browse/KAFKA-3101>)
The key observation is that windows never close/complete. There could always be late arriving events that appear long after a window's end interval and those need to be accounted for properly. In Kafka Streams that means that such late arriving events continue to update the value of the window. As described in the above JIRA, some optimisations could still be possible (e.g., batch requests as described in KIP-63 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-63:+Unify+store+and+downstream+caching+in+streams>), however they are not implemented yet. So your code needs to handle each update. Thanks Eno > On 13 Jun 2016, at 11:13, Clive Cox <clivej...@yahoo.co.uk.INVALID> wrote: > > Hi, > I would like to process a stream with a tumbling window of 5secs, create > aggregated stats for keys and push the final aggregates at the end of each > window period to a analytics backend. I have tried doing something like: > stream > .map > .reduceByKey(... > , TimeWindows.of("mywindow", 5000L),...) > .foreach { send stats > } > But I get every update to the ktable in the foreach. > How do I just get the final values once the TumblingWindow is complete so I > can iterate over them and send to some external system? > Thanks, > Clive > PS Using kafka_2.10-0.10.0.0 >