Re: Kafka Streams incorrect aggregation results when re-balancing occurs

2019-08-21 Thread Matthias J. Sax
> So with exactly_once, it must roll-back commit(s) to the state store in a failure scenario?

Yes. Dirty writes into the stores are "cleaned up" if you enable exactly-once processing semantics. A "commit" is never rolled back, as a commit indicates successful processing :) -Matthias On 8/
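For reference, enabling exactly-once in a Kafka Streams application is a single configuration entry. The sketch below builds the configuration with plain `java.util.Properties` and string keys so that it runs without the Kafka client on the classpath. The application id and broker address are hypothetical placeholders, and `exactly_once` is the value name used in the Kafka versions current at the time of this thread.

```java
import java.util.Properties;

public class ExactlyOnceConfigSketch {

    /** Builds a Streams configuration with exactly-once processing enabled. */
    static Properties buildConfig() {
        Properties props = new Properties();
        // Hypothetical values, chosen only for illustration.
        props.put("application.id", "aggregation-demo");
        props.put("bootstrap.servers", "localhost:9092");
        // The default is "at_least_once"; "exactly_once" makes state store
        // updates and offset commits part of one atomic transaction.
        props.put("processing.guarantee", "exactly_once");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(buildConfig().getProperty("processing.guarantee"));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor together with the topology.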

Re: Kafka Streams incorrect aggregation results when re-balancing occurs

2019-08-21 Thread Bruno Cadonna
Hi Alex, if you are interested in understanding exactly-once in a bit more detail, I recommend watching the following Kafka Summit talk by Matthias: https://www.confluent.io/kafka-summit-london18/dont-repeat-yourself-introducing-exactly-once-semantics-in-apache-kafka Best, Bruno On Wed, Aug

Re: Kafka Streams incorrect aggregation results when re-balancing occurs

2019-08-20 Thread Alex Brekken
Thanks guys. I knew that re-processing messages was a possibility with at_least_once processing, but I guess I hadn't considered the potential impact on the state stores as far as aggregations are concerned. So with exactly_once, it must roll-back commit(s) to the state store in a failure scenari

Re: Kafka Streams incorrect aggregation results when re-balancing occurs

2019-08-20 Thread Bruno Cadonna
Hi Alex, what you describe about failing before offsets are committed is one reason why records are processed multiple times under the at-least-once processing guarantee. That is a reality of life, as you stated. Kafka Streams in exactly-once mode guarantees that these duplicate state updates do not h
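The failure mode described here can be illustrated with a small, self-contained simulation. This is only a sketch and does not use Kafka: the class name, the `run` method, and the per-record snapshot used to imitate a rollback are all assumptions made for illustration. Kafka Streams itself achieves the same effect with transactions rather than in-memory snapshots.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateAggregationSketch {

    /**
     * Counts records per key. A single simulated failure happens after the
     * store update for the record at offset crashAt, but before that offset
     * is committed. Pass -1 for crashAt to run without a failure.
     */
    static Map<String, Integer> run(List<String> records, int crashAt, boolean rollbackOnFailure) {
        Map<String, Integer> store = new HashMap<>();
        int committedOffset = 0;
        boolean crashed = false;
        int offset = 0;
        while (offset < records.size()) {
            Map<String, Integer> snapshot = new HashMap<>(store);
            store.merge(records.get(offset), 1, Integer::sum); // state update
            if (!crashed && offset == crashAt) {
                crashed = true;
                if (rollbackOnFailure) {
                    store = snapshot;      // exactly-once: dirty write is discarded
                }
                offset = committedOffset;  // restart from the last committed offset
                continue;
            }
            committedOffset = offset + 1;  // commit marks successful processing
            offset = committedOffset;
        }
        return store;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a", "a", "b");
        System.out.println(run(records, 1, false)); // the count for "a" is 3, one too many
        System.out.println(run(records, 1, true));  // the count for "a" is 2
    }
}
```

Without the rollback, the record at the failed offset is applied to the store twice, which is the inflated aggregate reported in this thread.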

RE: Kafka Streams incorrect aggregation results when re-balancing occurs

2019-08-20 Thread Tim Ward
I asked an essentially similar question a week or two ago. The answer was "this is expected behaviour unless you switch on exactly-once processing". (In my case it was solved by changing the topology, which I had to do for other, unconnected, reasons (the requirements for the application changed