My company has been running Kafka on a three-node cluster. Let us call the nodes master, slave1, and slave2.
All three nodes are running Kafka. However, I found out that I/O is broken on the mounted partition (`/mnt/`) on the master broker: even root cannot read, write, or execute any of the files there. It is strange that Kafka is still running at all. The other two brokers are fine, but I think only one of them is actually functioning. I want to replace the corrupted disk on the master and then re-enable Kafka on it.

From my understanding, when I kill Kafka on the master, one of the followers will elect itself as leader, and everything should keep working. My concerns are:

1. Continuing to send messages to the master while its Kafka is down might mess up the master server. (I pass a comma-separated list of all three brokers to the consumer, but I need to make sure this is safe.)

2. The cluster might be poorly configured, so that it is not really a three-node cluster but three one-node clusters, and the master would not be fault-tolerant. For example, on slave1:

        $ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic metric.topic
        Topic:metric.topic  PartitionCount:1  ReplicationFactor:1  Configs:
            Topic: metric.topic  Partition: 0  Leader: 1  Replicas: 1  Isr: 1

   and on slave2:

        Topic:metric.topic  PartitionCount:1  ReplicationFactor:1  Configs:
            Topic: metric.topic  Partition: 0  Leader: 2  Replicas: 2  Isr: 2

   (I cannot check this on the master, because the I/O permissions are messed up there.) These two brokers seem to run separately, even though they receive the same messages from the producers.

How can I make sure neither of these two things happens? In particular, where in the Kafka documentation is my concern #1 addressed?

--
Best,
Eric
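P.S. If the topic really does have ReplicationFactor:1, my understanding is that the replication factor can be raised afterwards with kafka-reassign-partitions.sh, roughly like this (just a sketch; the broker ids and the JSON file name are my own assumptions):

        $ cat increase-replication.json
        {"version":1,
         "partitions":[{"topic":"metric.topic","partition":0,"replicas":[0,1,2]}]}
        $ bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
            --reassignment-json-file increase-replication.json --execute

Please correct me if that is not the right way to fix it.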