Thanks Eno for your comments and references. Perhaps I can explain what I want to achieve and you can suggest the correct topology? I want to process a stream of events, aggregate them, and send the result to an analytics backend (InfluxDB), so that rather than sending 1000 points/sec to the backend, I send far fewer. I'm only interested in the processing time of each event, so in that respect there are no "late arriving" events. I was hoping I could use a tumbling window: once its end time has passed, I send the consolidated aggregation for that window and then throw the window away.
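
To make the intended behaviour concrete (aggregate by processing time, emit once per window, then discard the window), here is a minimal sketch in plain Java. It is not Kafka Streams API and not a suggested topology: the class name TumblingAggregator, the add method, and the use of a per-key sum as the aggregate are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Kafka Streams. */
final class TumblingAggregator {
    private final long windowSizeMs;
    private long currentWindowStart = -1L;            // -1 means no window is open yet
    private final Map<String, Long> sums = new HashMap<>();

    TumblingAggregator(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    /**
     * Adds one event stamped with the processing time nowMs.
     * Returns the final per-key sums of the previous window if this event
     * falls into a later window, otherwise an empty map.
     */
    Map<String, Long> add(String key, long value, long nowMs) {
        long windowStart = nowMs - (nowMs % windowSizeMs);
        Map<String, Long> closed = new HashMap<>();
        if (currentWindowStart >= 0 && windowStart > currentWindowStart) {
            closed.putAll(sums);   // final aggregates for the window that just ended
            sums.clear();          // throw the old window away
        }
        currentWindowStart = Math.max(currentWindowStart, windowStart);
        sums.merge(key, value, Long::sum);
        return closed;
    }

    public static void main(String[] args) {
        TumblingAggregator agg = new TumblingAggregator(5000L);
        agg.add("cpu", 3L, 1000L);
        agg.add("cpu", 4L, 4000L);
        // First event of the next 5-second window flushes the previous one.
        Map<String, Long> finished = agg.add("cpu", 1L, 6000L);
        System.out.println(finished); // prints {cpu=7}
    }
}
```

Note that in this sketch a window is only flushed when the next event arrives; a real implementation would also need a timer so that a quiet stream still emits its last window.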
It sounds like from the references you give that this is not possible at present in Kafka Streams?

Thanks,
Clive

On Monday, 13 June 2016, 11:32, Eno Thereska <eno.there...@gmail.com> wrote:

Hi Clive,

The behaviour you are seeing is indeed correct (though not necessarily optimal in terms of performance as described in this JIRA: https://issues.apache.org/jira/browse/KAFKA-3101).

The key observation is that windows never close/complete. There could always be late arriving events that appear long after a window's end interval and those need to be accounted for properly. In Kafka Streams that means that such late arriving events continue to update the value of the window.

As described in the above JIRA, some optimisations could still be possible (e.g., batch requests as described in KIP-63 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-63:+Unify+store+and+downstream+caching+in+streams>), however they are not implemented yet. So your code needs to handle each update.

Thanks
Eno

> On 13 Jun 2016, at 11:13, Clive Cox <clivej...@yahoo.co.uk.INVALID> wrote:
>
> Hi,
> I would like to process a stream with a tumbling window of 5secs, create aggregated stats for keys and push the final aggregates at the end of each window period to a analytics backend. I have tried doing something like:
> stream
> .map
> .reduceByKey(...
> , TimeWindows.of("mywindow", 5000L),...)
> .foreach { send stats
> }
> But I get every update to the ktable in the foreach.
> How do I just get the final values once the TumblingWindow is complete so I can iterate over them and send to some external system?
> Thanks,
> Clive
> PS Using kafka_2.10-0.10.0.0