In the Kafka official document you may want more than one, though, in case a server is down
But when there are invalid brokers in the bootstrap.servers, the producer may fail! I Set up one broker 10.142.33.51:9092 on CentOS 7, using kafka 0.10.1.0. My Producer is like Properties props = new Properties(); props.put("bootstrap.servers", "10.142.233.51:9092, 10.142.233.56:9092"); props.put("retry", 3); props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer"); props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer"); KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props); String topic = "kfktest"; // try to fetch the metadata producer.partitionsFor(topic);// or producer.send(new ProducerRecord()) Note that broker 10.142.233.56:9092 does not start or is already dead there is possibility that the producer request metadata from Node2, which is unavailable. So the producer got a "org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms." after repeatedly printing 2018-04-09 13:11:37,171 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -22018-04-09 13:11:37,279 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -22018-04-09 13:11:37,386 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -22018-04-09 13:11:37,467 WARN [org.apache.kafka.common.network.Selector] - Error in I/O with /10.142.233.56 java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744) at org.apache.kafka.common.network.Selector.poll(Selector.java:238) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191) at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:122) at java.lang.Thread.run(Thread.java:745)2018-04-09 13:11:37,468 DEBUG [org.apache.kafka.clients.NetworkClient] - Node -2 disconnected.2018-04-09 13:11:37,486 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -22018-04-09 13:11:37,486 DEBUG [org.apache.kafka.clients.NetworkClient] - Init connection to node -2 for sending metadata request in the next iteration2018-04-09 13:11:37,487 DEBUG [org.apache.kafka.clients.NetworkClient] - Initiating connection to node -2 at 10.142.233.56:9092.2018-04-09 13:11:37,487 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -22018-04-09 13:11:37,589 DEBUG [org.apache.kafka.clients.NetworkClient] - Trying to send metadata request to node -2 Then the Producer is terminated. If you replace the partitionFor function with a send(), the msg will not be send to kafka, namely LOST. This scenario is reproducible. The producer was supposed to send request to another node(Node1) in the bootstrap.servers list. But it didn't. I checked the source code, only find in package org.apache.kafka.clients.NetworkClient#DefaultMetadataUpdater.maybeUpdate(long, Node), which chooses a leastLoadedNode to send metadata request. When a broker in the bootstrap.servers is down, Will the producer turn to another node? If so, where are the related source code located? otherwise, why the document says " you may want more than one, though, in case a server is down " in the bootstrap.servers description ? Thanks for your attention! Best Regards! Xu