I wrote a little unit test that works as I expect on Linux (Ubuntu, Red Hat, OpenJDK 1.8). It fails regularly on a colleague's OS X build environment with Oracle jdk1.8.0_45.

The intent of the test is to write out a file of random lines. Each line is sent to Kafka as a message. A single consumer reads these messages and writes them to a file. If the consumer waits longer than a second without receiving data, it assumes that all the messages have been delivered and closes the output file; the test then compares the input and output files. On OS X, the output file is always short.

1) Am I making unreasonable assumptions in the test? I know that it's legal for the consuming thread to fall asleep for a second, or for Kafka to wait and not deliver a message for a second, but is it reasonable? There's nothing else going on: no swapping, plenty of free disk and memory.

2) Are there known differences in OS/JVM behavior that might affect the test?

3) We have experimented with longer waits for all the messages, but have not found anything that works 100% of the time.

I am concerned that my basic understanding of how Kafka is supposed to work is fundamentally flawed. For example, does Kafka not deliver all available messages unless a buffer size is met? Did I miss some flush or commit call that is required to see all messages? Is there a setting to reduce the wait time for the last messages?

I think Apache strips out attachments, so I've made the code available here <https://drive.google.com/open?id=0ByzSB9b42jf0aDZYRnNJdWx1OHc> as a simple stand-alone Maven package (.tar.gz). Unpack, review, and run with `mvn verify`.

Thanks!
-Eric
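
P.S. For context, here is a minimal sketch of the idle-timeout consumer loop described above. It is for illustration only and uses the newer KafkaConsumer poll API rather than whatever client the linked package actually builds against; the broker address, topic name, group id, and output file name below are placeholders, not the real test values.

```java
import java.io.PrintWriter;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class IdleTimeoutConsumer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        props.put("group.id", "line-copy-test");             // placeholder group id
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             PrintWriter out = new PrintWriter("output.txt")) {   // placeholder output file

            consumer.subscribe(Collections.singletonList("test-lines")); // placeholder topic

            long lastMessageAt = System.currentTimeMillis();

            // Keep polling until more than one second passes with no messages,
            // then assume everything has been delivered and stop.
            while (System.currentTimeMillis() - lastMessageAt < 1000L) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(100));
                if (!records.isEmpty()) {
                    lastMessageAt = System.currentTimeMillis();
                    for (ConsumerRecord<String, String> record : records) {
                        out.println(record.value());   // one output line per message
                    }
                }
            }
        }
    }
}
```

The question boils down to whether that one-second idle check is a safe way to decide "no more messages are coming", or whether broker/consumer buffering can legitimately leave a gap longer than that at the tail of the stream.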
It fails regularly on a colleagues' OSX build environment with Oracle jdk1.8.0_45. The intent of the test is to write out a file of random lines. Each line is sent into a kafka as a message. A single consumer reads these messages and writes them to a file. If the consumer waits longer than a second receiving data, it assumes that all the messages have been delivered, and closes the output file; the test compares the input and output files. On OSX, the output file is always short. 1) am I making unreasonable assumptions in the test? I know that its legal for the consuming thread to fall asleep for a second, or for kafka to wait and not deliver the message for a second, but is it reasonable? There's nothing else going on, no swapping, plenty of free disk and memory. 2) are there known differences in OS/JVM behaviors that might affect the test? 3) we have experimented with longer waits for all the messages, but have not found anything that works 100% of the time. I am concerned that my basic understanding of how kafka is supposed to work is fundamentally flawed. For example, does kafka not provide all available messages unless a buffer size is met? Did I miss some flush or commit call that is required to see all messages? Is there a setting to reduce the wait time for the last messages? I think apache strips out attachments, so I've made the code available here <https://drive.google.com/open?id=0ByzSB9b42jf0aDZYRnNJdWx1OHc> as a simple stand-alone maven package (.tar.gz). Unpack, review, and run with `mvn verify`. Thanks! -Eric