Hi,

1. Your topic partitions are not replicated (replication factor = 1). Increase the replication factor for better fault tolerance. With proper replication, Kafka brokers and producers can handle node failures without data loss.
2. It looks like the Kafka brokers are not in a single cluster. They might be configured with different ZooKeeper clusters. All Kafka servers should be configured with the same ZooKeeper cluster. Check your ZK cluster.

On Tue, Mar 29, 2016 at 4:57 AM, Eric Hyunwoo Na <e...@relcy.com> wrote:

> My company has been running Kafka on a three-node cluster. Let us call the
> nodes master, slave1, slave2.
>
> All three nodes are running Kafka.
>
> However, I found out I/O is screwed up in the mounting partition (`/mnt/`)
> on the master broker, and even the root cannot read, write, or execute any
> of the file there. It is strange how Kafka is still running.
>
> The other two brokers are fine, but I think only one of them is actually
> functioning.
>
> I want to replace the corrupted disk on the master, and then re-enable
> Kafka on the master.
>
> From my understanding, when I kill Kafka on the master, one of the
> followers will elect itself as a leader, and it should work fine.
>
> My concern is,
>
> 1. Keeping sending messages to master when Kafka is off might mess up the
> master server. (I pass a comma separated list of all three brokers in the
> consumer, but I need to make sure it's safe)
>
> 2. The cluster might be poorly configured so that it's not a three-node
> cluster, but actually three one-node cluster, and master would not be
> fault-tolerant.
>
> For example, on slave 1,
>
> $ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic metric.topic
> Topic:metric.topic PartitionCount:1 ReplicationFactor:1 Configs:
> Topic: metric.topic Partition: 0 Leader: 1 Replicas: 1 Isr: 1
>
> on slave 2,
>
> Topic:metric.topic PartitionCount:1 ReplicationFactor:1 Configs:
> Topic: metric.topic Partition: 0 Leader: 2 Replicas: 2 Isr: 2
>
> (I cannot check this for master, because of I/O permission is messed up
> there)
>
> These two seem to run separately, although they receive the same messages
> from the producers.
>
> How can I make sure these two things would not happen?
>
> Especially, where in the Kafka documentation are they addressing my concern
> #1?
>
> --
> Best,
>
> Eric
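
To make the replication-factor advice concrete, here is a rough sketch of how the reassignment plan for `metric.topic` could be written once all brokers point at the same ZooKeeper cluster. The broker ids `1,2,0`, the file name `increase-rf.json`, and the `zk-host:2181` address are assumptions for illustration, not values taken from your setup. The script only generates the JSON plan; the Kafka commands that would consume it are left as comments, so please check them against the documentation for your Kafka version before running anything.

```shell
#!/bin/sh
# Sketch: build a reassignment plan that raises one partition of a topic
# from a single replica to three replicas.

TOPIC="metric.topic"
REPLICAS="1,2,0"               # assumed broker ids; the first is the preferred leader
PLAN_FILE="increase-rf.json"   # hypothetical file name

make_plan() {
  # $1 = topic, $2 = partition number, $3 = comma-separated broker ids
  printf '{"version":1,"partitions":[{"topic":"%s","partition":%s,"replicas":[%s]}]}\n' \
    "$1" "$2" "$3"
}

make_plan "$TOPIC" 0 "$REPLICAS" > "$PLAN_FILE"
cat "$PLAN_FILE"

# Before applying the plan, confirm that every broker is registered in the
# same ZooKeeper cluster (zk-host is a placeholder):
#   bin/zookeeper-shell.sh zk-host:2181 ls /brokers/ids
#
# Then apply and verify the plan, and describe the topic again:
#   bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
#       --reassignment-json-file increase-rf.json --execute
#   bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
#       --reassignment-json-file increase-rf.json --verify
#   bin/kafka-topics.sh --describe --zookeeper zk-host:2181 --topic metric.topic
```

If the reassignment succeeds, the describe output should list three broker ids under both Replicas and Isr for partition 0, rather than the single id shown in your output above.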