Re: Kafka Streams Produced Wrong (duplicated) Results with Simple Windowed Aggregation Case

2018-06-04 Thread EC Boost
Thanks for your help. For sliding windows, the changelog-as-output behaviour is as expected. But for non-overlapping windows, most use cases expect micro-batch semantics (no intermediate changelog as output, only the final aggregation in the window). Any examples for reference to implement micro

Re: Kafka Streams Produced Wrong (duplicated) Results with Simple Windowed Aggregation Case

2018-06-04 Thread John Roesler
Hi EC, Thanks for the very clear report and question. Like Guozhang said, this is expected (but not ideal) behavior. For an immediate work-around, you can try materializing the KTable and setting the commit interval and cache size as discussed here (https://www.confluent.io/blog/watermarks-tables
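
A minimal sketch of the two settings mentioned above, assuming the standard Kafka Streams config keys `commit.interval.ms` and `cache.max.bytes.buffering`. The application id, broker address, and the numeric values are hypothetical placeholders, not taken from the thread. A larger cache and a longer commit interval should reduce how many intermediate window updates are forwarded downstream, but they are not expected to guarantee exactly one result per window.

```java
import java.util.Properties;

public class CachingConfigSketch {
    // Builds Streams properties that let the record cache absorb more updates
    // per key before they are flushed downstream.
    public static Properties build(long commitIntervalMs, long cacheBytes) {
        Properties props = new Properties();
        props.put("application.id", "windowed-agg-demo");   // hypothetical name
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical address
        // The cache is flushed on every commit, so a longer interval means fewer flushes.
        props.put("commit.interval.ms", Long.toString(commitIntervalMs));
        // Total bytes available for record caching across all threads.
        props.put("cache.max.bytes.buffering", Long.toString(cacheBytes));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(30_000L, 10L * 1024 * 1024);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting `Properties` object would then be passed to the `KafkaStreams` constructor along with the topology.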

Re: Kafka Streams Produced Wrong (duplicated) Results with Simple Windowed Aggregation Case

2018-06-04 Thread Guozhang Wang
Hello, Your observation is correct: by default, Kafka Streams emits continuous updates to each window instead of waiting for the "final" update for each window. There is some ongoing work to provide functionality that allows users to specify something like "give me the final result for windowed ag
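
To illustrate the difference between continuous updates and a single "final" result per window, here is a small plain-Java simulation. It does not use the Kafka Streams API; the class name, the tumbling-window arithmetic, and the sample timestamps are all assumptions made for illustration, and it assumes all input has already arrived (no late records).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WindowEmitDemo {
    // Start of the tumbling window that contains the (non-negative) timestamp.
    static long windowStart(long ts, long size) {
        return ts - (ts % size);
    }

    // Changelog style: emit the updated count after every input record.
    public static List<String> changelog(long[] ts, String[] keys, long size) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < ts.length; i++) {
            String k = windowStart(ts[i], size) + "/" + keys[i];
            int c = counts.merge(k, 1, Integer::sum);
            out.add(k + "=" + c);
        }
        return out;
    }

    // Final-only style: emit one count per (window, key) once all input is read.
    public static List<String> finalOnly(long[] ts, String[] keys, long size) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < ts.length; i++) {
            counts.merge(windowStart(ts[i], size) + "/" + keys[i], 1, Integer::sum);
        }
        List<String> out = new ArrayList<>();
        counts.forEach((k, c) -> out.add(k + "=" + c));
        return out;
    }

    public static void main(String[] args) {
        long[] ts = {0, 1, 2, 11};
        String[] keys = {"t1", "t1", "t5", "t1"};
        System.out.println(changelog(ts, keys, 10)); // [0/t1=1, 0/t1=2, 0/t5=1, 10/t1=1]
        System.out.println(finalOnly(ts, keys, 10)); // [0/t1=2, 0/t5=1, 10/t1=1]
    }
}
```

The extra `0/t1=1` entry in the first list is the kind of intermediate update that can look like a duplicate when the output is read as final results.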

Re: Kafka Streams Produced Wrong (duplicated) Results with Simple Windowed Aggregation Case

2018-06-04 Thread EC Boost
Logged the internal windows information:
Window{start=152804303, end=152804304} key=t6 1
Window{start=152804304, end=152804305} key=t1 2
Window{start=152804304, end=152804305} key=t7 3
Window{start=152804304, end=152804305} key=t5 4
Window{start=152804304, e

Kafka Streams Produced Wrong (duplicated) Results with Simple Windowed Aggregation Case

2018-06-03 Thread EC Boost
Hello Everyone, I got duplicated results using kstreams for a simple windowed aggregation. The input event format is comma separated: "event_id,event_type", and I need to aggregate the events by event type. The following is the Kafka Streams processing logic: events.map((k, v) -> KeyValue.pair(v.spl