Hello Scott,

As you can see, the logs are telling you what is happening:

The size of the current ISR Set(1) is insufficient to satisfy the min.isr requirement of 2 for partition <FOO> .
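
To put numbers on that message, here is a rough Python sketch. It is not anything from Kafka itself, and the function names are made up for illustration. It assumes every surviving replica is in sync (the best case) and uses the replica assignments from the topic description in your message to show which partitions stop accepting acks=all writes when one broker goes down.

```python
# Replica assignments copied from the topic description in the original message.
REPLICAS = {
    0: [6, 5], 1: [1, 6], 2: [2, 1], 3: [3, 2], 4: [4, 3],
    5: [5, 4], 6: [6, 1], 7: [1, 2], 8: [2, 3], 9: [3, 4],
}


def tolerated_failures(replication_factor, min_insync_replicas):
    """Brokers a partition can lose while still accepting acks=all writes."""
    return max(replication_factor - min_insync_replicas, 0)


def blocked_partitions(replicas, down_broker, min_insync_replicas):
    """Partitions left with fewer live replicas than min.insync.replicas.

    Assumption: every replica on a live broker is in sync (best case).
    """
    blocked = []
    for partition, brokers in sorted(replicas.items()):
        alive = [b for b in brokers if b != down_broker]
        if len(alive) < min_insync_replicas:
            blocked.append(partition)
    return blocked


if __name__ == "__main__":
    print(tolerated_failures(2, 2))  # replication factor 2, min.isr 2
    print(tolerated_failures(3, 2))  # replication factor 3, min.isr 2
    print(blocked_partitions(REPLICAS, down_broker=1, min_insync_replicas=2))
    print(blocked_partitions(REPLICAS, down_broker=1, min_insync_replicas=1))
```

With your current settings the first call gives 0 tolerated failures, and losing broker 1 blocks partitions 1, 2, 6 and 7, since each of them has a replica on that broker. With min.isr at 1 the last call returns an empty list.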

This error means you need to either increase the replication factor above 2 (for example, to 3) or decrease min.insync.replicas to 1. Since your replication factor is currently 2 and min.insync.replicas is 2 as well, every replica of a partition has to be in sync for a write to be accepted, so you cannot afford to lose a single broker.

I guess your producers are using acks=all (or -1), which is the setting under which min.insync.replicas is enforced.

Hope that helps.

Cheers!
--
Jonathan

On Wed, Jun 5, 2019 at 7:25 PM Hamer, Scott <scott.ha...@bestbuy.com> wrote:

> I have some questions regarding Kafka ISRs and the number of servers
> one should have.
>
> In our production environment we have a 6 node Kafka cluster with
> min.insync.replicas=2, and all of our topics have a ReplicationFactor of 2.
>
> When we create topics, we do not pin them to a specific broker. A 10
> partition topic looks like this:
>
> Topic:<FOO> PartitionCount:10 ReplicationFactor:2 Configs:
>
> Topic: <FOO> Partition: 0 Leader: 6 Replicas: 6,5 Isr: 6,5
> Topic: <FOO> Partition: 1 Leader: 1 Replicas: 1,6 Isr: 1,6
> Topic: <FOO> Partition: 2 Leader: 2 Replicas: 2,1 Isr: 1,2
> Topic: <FOO> Partition: 3 Leader: 3 Replicas: 3,2 Isr: 3,2
> Topic: <FOO> Partition: 4 Leader: 4 Replicas: 4,3 Isr: 4,3
> Topic: <FOO> Partition: 5 Leader: 5 Replicas: 5,4 Isr: 5,4
> Topic: <FOO> Partition: 6 Leader: 6 Replicas: 6,1 Isr: 1,6
> Topic: <FOO> Partition: 7 Leader: 1 Replicas: 1,2 Isr: 1,2
> Topic: <FOO> Partition: 8 Leader: 2 Replicas: 2,3 Isr: 3,2
> Topic: <FOO> Partition: 9 Leader: 3 Replicas: 3,4 Isr: 4,3
>
> Whenever we lose 1 node, the cluster will not allow anyone to produce and
> the entire cluster fills our logs with:
>
> The size of the current ISR Set(1) is insufficient to satisfy the min.isr
> requirement of 2 for partition <FOO> .
>
> Do I need more brokers to sustain losing 1?
>
> Is my partition schema out of whack?
>
> Is there a formula out there that describes the number of brokers to ISRs
> to replicas?
>
> Should I tweak the replica.fetch.wait.max.ms?
>
> Any guidance will be highly appreciated.
>
> --shamer

--
Santilli Jonathan