Hi,
The log was not attached. I'm assuming your topic has a replication factor
greater than 1, so that it remains available from another broker if the
partition leader fails. Try adding
props.put("acks", "all");
to your producer and run your experiment again. If you created your
topic with --replication-factor 3 and your brokers (or the topic itself) are
configured with min.insync.replicas=2, for example, then with acks=all your
producer will require acknowledgment of receipt of each message from at
least 2 of your 3 brokers, which makes your topic resilient.
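As a rough sketch of the settings described above, assuming the new Java producer (the one configured through bootstrap.servers): the class and method names below are hypothetical helpers, not part of Kafka, and the retries value and the choice of StringSerializer are assumptions for illustration only.

```java
import java.util.Properties;

// Hypothetical helper (not part of Kafka): gathers the producer settings discussed above.
final class ReliableProducerConfig {

    // Builds producer properties that wait for acknowledgment from the in-sync replicas.
    static Properties reliableProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Wait for the in-sync replicas, not only the partition leader.
        props.setProperty("acks", "all");
        // Assumption: a small retry count so transient leader changes are retried.
        props.setProperty("retries", "3");
        // Assumption: String keys and values; substitute your own serializers.
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    // Number of replicas that can be unavailable while acks=all writes are still accepted.
    static int tolerableReplicaFailures(int replicationFactor, int minInsyncReplicas) {
        if (minInsyncReplicas < 1 || minInsyncReplicas > replicationFactor) {
            throw new IllegalArgumentException(
                    "min.insync.replicas must be between 1 and the replication factor");
        }
        return replicationFactor - minInsyncReplicas;
    }

    public static void main(String[] args) {
        Properties props = reliableProps("node1:9092,node2:9092,node3:9092");
        System.out.println("acks = " + props.getProperty("acks"));
        System.out.println("tolerable replica failures = " + tolerableReplicaFailures(3, 2));
    }
}
```

The returned Properties object would be passed to the producer constructor in place of the props built in your loop. With a replication factor of 3 and min.insync.replicas=2, the helper reports that one replica can be down while writes are still accepted.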
Hope this helps,
Tom Aley
From: "许志峰" <[email protected]>
To: [email protected]
Date: 01/03/2018 08:59
Subject: When a broker down, Producer LOST messages!
Hi all,
I have a Kafka cluster with 3 nodes: node1, node2, node3.
The Kafka version is 0.8.2.1, which I cannot change!
A producer writes messages to Kafka. Its code structure looks like this, in
pseudo-code:
Properties props = new Properties();
props.put("bootstrap.servers", "node1:9092,node2:9092,node3:9092");
props.put(key serializer and value serializer);
for (i = 0; i < 10; ++i) {
    producer = new Producer(props);
    msg = "this is msg " + i;
    producer.send(msg);
    producer.close();
}
After the first 4 messages were sent successfully, I killed the broker on
node1. The 5th and 6th messages were LOST.
The producer first gets the broker list from the PropertyConfig, i.e. [node1,
node2, node3]; it then chooses one broker, connects to it, and gets
METADATA from it.
I heard that when one broker in the list is unavailable, the Kafka client
will switch to another.
But in my case, if the producer chooses node1, which is already dead, it
gets a Fetch MetaData Timeout Exception and STOPS! The msg is not written to
Kafka.
Attached is the complete log. You only need to look at the colored lines.
You can see that I wrote 10 msgs to Kafka: the first 4 succeeded; after I
killed one broker, msg5 and msg6 were LOST because the producer chose node1;
msg7, 8, 9 and 10 succeeded because it did not choose node1.
I checked the Kafka source code and found nothing.
Does anybody know the reason? Where are the related classes/functions
located in the source code?
Any clue will be appreciated!