It sounds like you want a tumbling time window, rather than a sliding window

https://kafka.apache.org/documentation/streams#streams_dsl_windowing

On 15 June 2017 at 14:38, Paolo Patierno <ppatie...@live.com> wrote:

> Hi,
>
>
> using the streams library I noticed a difference (or there is a lack of
> knowledge on my side)with Apache Spark.
>
> Imagine following scenario ...
>
>
> I have a source topic where numeric values come in and I want to check the
> maximum value in the latest 5 seconds but ... putting the max value into a
> destination topic every 5 seconds.
>
> This is what happens with reduceByWindow method in Spark.
>
> I'm using reduce on a KStream here that process the max value taking into
> account previous values in the latest 5 seconds but the final value is put
> into the destination topic for each incoming value.
>
>
> For example ...
>
>
> An application sends numeric values every 1 second.
>
> With Spark ... the source gets values every 1 second, process max in a
> window of 5 seconds, puts the max into the destination every 5 seconds (so
> when the window ends). If the sequence is 21, 25, 22, 20, 26 the output
> will be just 26.
>
> With Kafka Streams ... the source gets values every 1 second, process max
> in a window of 5 seconds, puts the max into the destination every 1 seconds
> (so every time an incoming value arrives). Of course, if for example the
> sequence is 21, 25, 22, 20, 26 ... the output will be 21, 25, 25, 25, 26.
>
>
> Is it possible with Kafka Streams ? Or it's something to do at application
> level ?
>
>
> Thanks,
>
> Paolo
>
>
> Paolo Patierno
> Senior Software Engineer (IoT) @ Red Hat
> Microsoft MVP on Windows Embedded & IoT
> Microsoft Azure Advisor
>
> Twitter : @ppatierno<http://twitter.com/ppatierno>
> Linkedin : paolopatierno<http://it.linkedin.com/in/paolopatierno>
> Blog : DevExperience<http://paolopatierno.wordpress.com/>
>

Reply via email to