Hi all,
I have a Kafka cluster with 3 nodes: node1, node2, node3. *The Kafka version is 0.8.2.1, which I cannot change!*

A producer writes messages to Kafka. Its code framework looks like this, in pseudo-code:

    Properties props = new Properties();
    props.put("bootstrap.servers", "node1:9092,node2:9092,node3:9092");
    props.put(key serializer and value serializer);
    for (i = 0; i < 10; ++i) {
        producer = new Producer(props);
        msg = "this is msg " + i;
        producer.send(msg);
        producer.close();
    }

After the first 4 messages were sent successfully, I killed the broker on node1. The 5th and 6th messages were LOST.

The producer first gets the broker list from the configured properties, i.e. [node1, node2, node3]. Then it chooses one broker, connects to it, and fetches METADATA from it. I heard that when one broker in the list is unavailable, the Kafka client will switch to another one. But in my case, if the producer chooses node1, which is already dead, it gets a fetch-metadata timeout exception and STOPS! The message is not written to Kafka.

Attached is the complete log; you only need to look at the colored lines. You can see that I wrote 10 messages to Kafka. The first 4 succeeded. After I killed one broker, msg5 and msg6 were LOST because the producer chose node1; msg7, 8, 9 and 10 succeeded because it did not choose node1.

I checked out the Kafka source code and found nothing. Does anybody know the reason? Where are the related classes/functions located in the source code? Any clue will be appreciated!
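
To make the failover behavior I expected from the client concrete, here is a minimal Java sketch. It is not Kafka's code: the class and method names (BootstrapFailover, firstReachable) are made up for illustration, and broker reachability is simulated with a set of dead nodes instead of real network connections.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; this is NOT how the Kafka client is implemented.
final class BootstrapFailover {

    // Walk the bootstrap list in order and return the first node that is
    // not marked dead, or an empty Optional if every node is dead.
    static Optional<String> firstReachable(List<String> bootstrap, Set<String> deadNodes) {
        for (String node : bootstrap) {
            if (!deadNodes.contains(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> bootstrap = Arrays.asList("node1:9092", "node2:9092", "node3:9092");
        // Simulate the killed broker on node1.
        Set<String> dead = new HashSet<>(Arrays.asList("node1:9092"));
        // Expected behavior: skip node1 and fetch metadata from node2.
        System.out.println(firstReachable(bootstrap, dead).orElse("no broker reachable"));
    }
}
```

With node1 marked dead, this prints node2:9092. That is the behavior I expected for msg5 and msg6, instead of a timeout on node1.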