Please don't take this comment the wrong way, but have you double-checked whether your counting code is working correctly? (I'm not suggesting this is the only possible reason for what you're observing.)
-Michael

On Fri, Nov 18, 2016 at 4:52 PM, Eno Thereska <eno.there...@gmail.com> wrote:
> Hi Ryan,
>
> Perhaps you could share some of your code so we can have a look? One thing
> I'd check is whether you are using compacted Kafka topics. If so, and if you
> have non-unique keys, compaction happens automatically and you might only
> see the latest value for a key.
>
> Thanks
> Eno
>
> On 18 Nov 2016, at 13:49, Ryan Slade <r...@avocet.io> wrote:
> >
> > Hi
> >
> > I'm trialling Kafka Streams for a large stream-processing job, but I'm
> > seeing message loss even in the simplest scenarios.
> >
> > I've tried to boil it down to the simplest scenario in which I see loss,
> > which is the following:
> > 1. Ingest messages from an input stream (String, String).
> > 2. Decode each message from JSON into a custom type.
> > 3. If successful, send it to a second stream (String, CustomType) and
> >    update an atomic counter.
> > 4. A foreach on the second stream updates an AtomicCounter each time a
> >    message arrives.
> >
> > Since we have at-least-once guarantees, I would expect the second stream
> > to see at least as many messages as were sent to it from the first, but I
> > consistently see message loss.
> >
> > I've tested multiple times, sending around 200k messages each run. I don't
> > see losses every time, maybe around 1 run in 5 with the same data. The
> > losses are small, around 100 messages, but I would expect none.
> >
> > I'm running version 0.10.1.0, with ZooKeeper, Kafka and the streams
> > consumer all running on the same machine in order to rule out packet loss.
> >
> > I'm running Ubuntu 16.04 with OpenJDK.
> >
> > Any advice would be greatly appreciated, as I can't move forward with
> > Kafka Streams as a solution if messages are consistently lost between
> > streams on the same machine.
> >
> > Thanks
> > Ryan
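
For context, a rough sketch of the kind of topology being described might look like the code below. This is not Ryan's actual code: the topic names, the decode() helper, and keeping the value as a String throughout are placeholders, and the real code presumably maps into a custom type with its own serde. It uses the 0.10.1-era DSL (KStreamBuilder, key.serde/value.serde defaults), which differs slightly from later releases.

    import java.util.Properties;
    import java.util.concurrent.atomic.AtomicLong;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;

    public class CountingTopology {

        static final AtomicLong forwarded = new AtomicLong();
        static final AtomicLong received = new AtomicLong();

        // Placeholder decode step: return null when the JSON can't be parsed.
        static String decode(String json) {
            return (json == null || json.isEmpty()) ? null : json;
        }

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "counting-test");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

            KStreamBuilder builder = new KStreamBuilder();

            // 1. Ingest (String, String) messages from the input topic.
            KStream<String, String> input = builder.stream("input-topic");

            // 2 + 3. Decode; drop failures; count everything that is forwarded.
            KStream<String, String> decoded = input
                .mapValues(CountingTopology::decode)
                .filter((key, value) -> {
                    if (value == null) {
                        return false;               // decode failed, drop the record
                    }
                    forwarded.incrementAndGet();    // counted just before forwarding
                    return true;
                });

            // Write to the second topic and keep consuming it as a new stream.
            // Assumes "decoded-topic" already exists (or broker auto-creation is on).
            KStream<String, String> second = decoded.through("decoded-topic");

            // 4. Count every message that arrives on the second stream.
            second.foreach((key, value) -> received.incrementAndGet());

            new KafkaStreams(builder, props).start();
        }
    }

With a layout like this, "forwarded" and "received" are updated on the stream threads of the same application instance, so comparing them only makes sense after the second stream has fully caught up. Regarding Eno's point about compaction: if the intermediate topic had been created with cleanup.policy=compact and the keys are not unique, fewer records downstream would be expected behaviour; kafka-topics.sh --zookeeper localhost:2181 --describe --topic decoded-topic shows any configuration overrides on that topic.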