I have some questions regarding Kafka ISRs and the number of servers one
should have.
In our production environment we have a 6-node Kafka cluster with
min.insync.replicas=2, and
all of our topics have a ReplicationFactor of 2.
When we create topics, we do not pin them to specific brokers. A
10-partition topic looks like this:
Topic:<FOO> PartitionCount:10 ReplicationFactor:2 Configs:
Topic: <FOO> Partition: 0 Leader: 6 Replicas: 6,5 Isr: 6,5
Topic: <FOO> Partition: 1 Leader: 1 Replicas: 1,6 Isr: 1,6
Topic: <FOO> Partition: 2 Leader: 2 Replicas: 2,1 Isr: 1,2
Topic: <FOO> Partition: 3 Leader: 3 Replicas: 3,2 Isr: 3,2
Topic: <FOO> Partition: 4 Leader: 4 Replicas: 4,3 Isr: 4,3
Topic: <FOO> Partition: 5 Leader: 5 Replicas: 5,4 Isr: 5,4
Topic: <FOO> Partition: 6 Leader: 6 Replicas: 6,1 Isr: 1,6
Topic: <FOO> Partition: 7 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: <FOO> Partition: 8 Leader: 2 Replicas: 2,3 Isr: 3,2
Topic: <FOO> Partition: 9 Leader: 3 Replicas: 3,4 Isr: 4,3
Whenever we lose 1 node, the cluster will not allow anyone to produce, and
the brokers fill our logs with:
The size of the current ISR Set(1) is insufficient to satisfy the min.isr
requirement of 2 for partition <FOO> .
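
To make the failure concrete, here is a rough Python sketch (not Kafka code) that counts which partitions of <FOO> would drop below min.insync.replicas if a single broker went away. The assignment is copied from the describe output above. The helper name under_min_isr is made up for illustration, and the sketch assumes every surviving replica stays in sync.

```python
# Replica assignment copied from the describe output above
# (partition number -> broker ids holding a replica).
REPLICAS = {
    0: [6, 5], 1: [1, 6], 2: [2, 1], 3: [3, 2], 4: [4, 3],
    5: [5, 4], 6: [6, 1], 7: [1, 2], 8: [2, 3], 9: [3, 4],
}
MIN_INSYNC_REPLICAS = 2


def under_min_isr(replicas, lost_broker, min_isr):
    """Hypothetical helper: list the partitions whose ISR would shrink
    below min_isr if lost_broker went down. Assumes all other replicas
    remain in sync."""
    affected = []
    for partition, brokers in sorted(replicas.items()):
        surviving = [b for b in brokers if b != lost_broker]
        if len(surviving) < min_isr:
            affected.append(partition)
    return affected


# Try losing each of the 6 brokers in turn.
for broker in range(1, 7):
    print(broker, under_min_isr(REPLICAS, broker, MIN_INSYNC_REPLICAS))
```

With a replication factor of 2 and min.insync.replicas=2, every partition that has a replica on the lost broker is left with an ISR of size 1, which matches the log message above.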
Do I need more brokers to survive losing 1?
Is my partition layout out of whack?
Is there a formula out there that relates the number of brokers to ISRs and
replicas?
Should I tweak replica.fetch.wait.max.ms?
Any guidance will be highly appreciated.
--shamer