Hi,

We have been using Kafka (0.8) for the past few months with the following setup:

Kafka brokers - 1
ZooKeeper ensemble - 3
Partitions per topic - 3

Yesterday, while running stress tests on one of the QA machines, we observed that a few messages that were produced within a couple of milliseconds of each other did not reach the Kafka consumer, i.e. there was no trace of those messages at the consumer end.

We first checked for errors on our side and for network issues, and did not find any. We then checked whether the missing message could be found in one of the Kafka partitions, and it was there, in one of the topic partitions. We are not sure why Kafka did not notify any consumer about the message. Are there any special cases where Kafka silently drops a message?

We also found a delay in the notifications/watches triggered from ZooKeeper. We are not sure whether the two are related.

The problem will be difficult to reproduce, as the test probably took a few days to complete. But we definitely lost approximately 5% of the messages. We have logs of the messages being produced at the producer side and the corresponding entries in the Kafka partition logs, but nothing at the consumer side.

The only repeating pattern was that the affected messages were probably produced within the same millisecond. For example, given a sequence of 4 messages M0, M1, M2, M3 produced in the same millisecond, we would probably have M0, M1 and M3 at the consumer, but not M2. It is puzzling that only one message out of the 4 is dropped.

We use the high-level Kafka producer and consumer. Both are single-threaded (at our end).

Does Kafka need its own dedicated ZooKeeper ensemble? We use the same ZooKeeper ensemble for our configuration service as well.

Unfortunately, we did not have DEBUG logging enabled on the server during the test, although no error messages were observed during that period.

Before we run the same tests again, can someone please shed more light on the reasons why Kafka might have dropped a few messages?

Kat
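
P.S. For the rerun, the comparison between producer-side and consumer-side logs could be scripted roughly as below. This is only a minimal sketch, not what was used in the original test: it assumes each message carries a unique id, that the producer log can be reduced to (timestamp in ms, id) pairs, and that the consumer log can be reduced to a list of ids. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def find_missing(produced, consumed_ids):
    """Return the produced (timestamp_ms, message_id) pairs whose id
    never shows up on the consumer side, in production order."""
    seen = set(consumed_ids)
    return [(ts, mid) for ts, mid in produced if mid not in seen]


def group_by_millisecond(produced):
    """Map each producer timestamp (ms) to the ids sent in that millisecond."""
    groups = defaultdict(list)
    for ts, mid in produced:
        groups[ts].append(mid)
    return dict(groups)


def missing_in_bursts(produced, consumed_ids):
    """Return (missing messages that shared a millisecond with at least
    one other message, total missing messages)."""
    groups = group_by_millisecond(produced)
    missing = find_missing(produced, consumed_ids)
    in_burst = [mid for ts, mid in missing if len(groups[ts]) > 1]
    return len(in_burst), len(missing)


# Hypothetical sample mirroring the M0..M3 case described above.
produced = [(1000, "M0"), (1000, "M1"), (1000, "M2"), (1000, "M3"), (1005, "M4")]
consumed = ["M0", "M1", "M3", "M4"]

print(find_missing(produced, consumed))
print(missing_in_bursts(produced, consumed))
```

If nearly all of the missing messages fall into the first count, that would support the same-millisecond pattern; if not, the timing may be a coincidence.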