I'm trying to run some very simple examples with Kafka and am running into a 
strange problem.

I've documented this in detail in 
https://issues.apache.org/jira/browse/KAFKA-12594#. The linked repo - 
https://github.com/jamii/streaming-consistency/tree/main/kafka-streams - should 
be runnable on any Linux system and depends only on Nix. 

The short version is:

* I'm reading a table from a single topic with JSON values.
* If I write that table directly back to Kafka, then I can see the results in 
the output topic, and a `.mapValues(...println...)` on that stream shows 
results in the terminal.
  * (On some runs, a contiguous subset of ~500 input records doesn't appear in 
the output.)
* If I uncomment any of the other examples, they don't produce any output. The 
write I mentioned in the bullet above also stops working, and the 
`.mapValues(...println...)` no longer produces any results.
* There are no errors in the Kafka logs (the full logs are attached in one of 
the comments on the issue).
* The process doesn't terminate or crash.
* Using strace I can see that the demo process reads the entire input topic, 
even though none of the printlns fire.

The symptoms above are reliably reproduced on my laptop (NixOS) and on a fresh 
Ubuntu 20.10 VM on Vultr, with one exception - I once saw the sums topic 
produce records on Vultr. That it worked at least once strongly suggests to me 
that there is a race somewhere, rather than e.g. a silent logic bug in the 
examples. But I don't know enough about the Kafka Streams internals to 
track it down any further.
