I have some questions regarding Kafka ISRs and the number of servers one should have.
In our production environment we have a 6-node Kafka cluster with min.insync.replicas=2, and all of our topics have a ReplicationFactor of 2. When we create topics, we do not pin them to specific brokers. A 10-partition topic looks like this:

    Topic:<FOO>  PartitionCount:10  ReplicationFactor:2  Configs:
        Topic: <FOO>  Partition: 0  Leader: 6  Replicas: 6,5  Isr: 6,5
        Topic: <FOO>  Partition: 1  Leader: 1  Replicas: 1,6  Isr: 1,6
        Topic: <FOO>  Partition: 2  Leader: 2  Replicas: 2,1  Isr: 1,2
        Topic: <FOO>  Partition: 3  Leader: 3  Replicas: 3,2  Isr: 3,2
        Topic: <FOO>  Partition: 4  Leader: 4  Replicas: 4,3  Isr: 4,3
        Topic: <FOO>  Partition: 5  Leader: 5  Replicas: 5,4  Isr: 5,4
        Topic: <FOO>  Partition: 6  Leader: 6  Replicas: 6,1  Isr: 1,6
        Topic: <FOO>  Partition: 7  Leader: 1  Replicas: 1,2  Isr: 1,2
        Topic: <FOO>  Partition: 8  Leader: 2  Replicas: 2,3  Isr: 3,2
        Topic: <FOO>  Partition: 9  Leader: 3  Replicas: 3,4  Isr: 4,3

Whenever we lose 1 node, the cluster will not allow anyone to produce, and the entire cluster fills our logs with:

    The size of the current ISR Set(1) is insufficient to satisfy the min.isr requirement of 2 for partition <FOO>

My questions:

- Do I need more brokers to sustain losing 1?
- Is my partition scheme out of whack?
- Is there a formula out there that relates the number of brokers to the number of ISRs and replicas?
- Should I tweak replica.fetch.wait.max.ms?

Any guidance will be highly appreciated.

--shamer
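
P.S. To make the arithmetic behind the error concrete, here is a minimal sketch that takes the replica assignment from the describe output above and lists, for each broker, the partitions that would be left with fewer replicas than min.insync.replicas if that broker went away. This is only my own illustration and not anything from Kafka itself: the function name under_min_isr is made up, and it assumes the best case in which every surviving replica stays in sync.

```python
# Replica assignment copied from the describe output: partition -> replica brokers.
REPLICAS = {
    0: [6, 5], 1: [1, 6], 2: [2, 1], 3: [3, 2], 4: [4, 3],
    5: [5, 4], 6: [6, 1], 7: [1, 2], 8: [2, 3], 9: [3, 4],
}
MIN_INSYNC_REPLICAS = 2  # value from our cluster config


def under_min_isr(replicas, lost_broker, min_isr):
    """Return the partitions left with fewer than min_isr replicas.

    Hypothetical helper for illustration only. It assumes every replica
    that is not on the lost broker remains in sync (the best case).
    """
    affected = []
    for partition, brokers in sorted(replicas.items()):
        surviving = [b for b in brokers if b != lost_broker]
        if len(surviving) < min_isr:
            affected.append(partition)
    return affected


if __name__ == "__main__":
    # Show which partitions drop below the minimum for each possible broker loss.
    for broker in range(1, 7):
        print(broker, under_min_isr(REPLICAS, broker, MIN_INSYNC_REPLICAS))
```

With a replication factor of 2 and a minimum of 2, any partition that has a replica on the lost broker ends up on the list, which matches the "ISR Set(1)" in the log message. Calling the same function with a minimum of 1 returns an empty list for every broker.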