This error doesn't necessarily mean that a broker is down; it can also mean that too many replicas for that topic partition have fallen behind the leader. Either way, it indicates that replication is lagging for some reason.
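
To make the distinction concrete, here is a minimal sketch in Python of the check behind this error. It is a simplified model, not Kafka's actual code: the names `PartitionState`, `is_under_replicated` and `rejects_acks_all` are hypothetical, and the three-broker replica assignment is an assumption, since the report only states an in-sync count of 1 against a required minimum of 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionState:
    """Hypothetical snapshot of one partition's replication state."""
    topic: str
    partition: int
    replicas: List[int]  # broker ids assigned to the partition
    isr: List[int]       # broker ids currently in sync with the leader


def is_under_replicated(p: PartitionState) -> bool:
    # A partition is under-replicated when some assigned replica
    # has dropped out of the in-sync set, even if every broker is up.
    return len(p.isr) < len(p.replicas)


def rejects_acks_all(p: PartitionState, min_insync_replicas: int) -> bool:
    # With acks=all, a write is refused with NotEnoughReplicasException
    # when the in-sync set is smaller than min.insync.replicas.
    return len(p.isr) < min_insync_replicas


# Assumed layout: three replicas, only the leader still in sync.
state = PartitionState("__consumer_offsets", 20, replicas=[1, 2, 3], isr=[1])

print(is_under_replicated(state))      # all brokers may be up, yet followers lag
print(rejects_acks_all(state, 2))      # ISR of 1 is below the minimum of 2
```

The point of the sketch is that both checks depend only on the size of the in-sync set, so a partition can fail them while every broker process is still running.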
You'll want to monitor some of the metrics listed here: http://kafka.apache.org/documentation.html#monitoring to help you understand a) when this occurs (the number of under-replicated partitions is a critical one) and b) what the cause might be (e.g. a saturated network, slow request processing due to contention for some other resource, etc.).

-Ewen

On Fri, Dec 9, 2016 at 5:20 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> What's the best way to fix NotEnoughReplication given all the nodes are up
> and running? Zookeeper did go down momentarily. We are on Kafka 0.10
>
> org.apache.kafka.common.errors.NotEnoughReplicasException: Number of insync
> replicas for partition [__consumer_offsets,20] is [1], below required
> minimum [2]

-- 
Thanks,
Ewen