Replica not available is not a fatal exception. This simply means that there is a replica that is down.
If you get Leader not available, that means the partition is offline.

-Clark

> On Aug 15, 2015, at 8:41 AM, Andrew Otto <ao...@wikimedia.org> wrote:
>
> Also strange: If I start this broker back up, and then issue a kafkacat
> metadata request, I do not see any 'Broker: Replica not available', even
> though this broker's preferred partitions have not yet replicated back in
> sync, and are not the leader. Everything seems normal.
>
> Somehow this broker being offline makes the rest of the cluster think that
> none of its replicas are available.
>
>> On Aug 15, 2015, at 11:18, Andrew Otto <ao...@wikimedia.org> wrote:
>>
>> I am having trouble with a single broker causing consumers to lag. As I am
>> troubleshooting this issue, I have stopped this broker in the hopes that
>> other replicas will take over as leader for this broker's preferred
>> partitions. However, when I do so, Camus reports:
>>
>> kafka.CamusJob: Skipping the creation of ETL request for Topic :
>> webrequest_text and Partition : 3 Exception :
>> kafka.common.ReplicaNotAvailableException
>>
>> kafka-topics.sh --describe shows:
>>
>> Topic: webrequest_text Partition: 3 Leader: 22 Replicas: 22,21,12 Isr: 22,21
>>
>> However, when I use kafkacat to look at metadata (which asks for metadata
>> from Kafka rather than Zookeeper), I see:
>>
>> partition 3, leader 22, replicas: 22,21, isrs: 22,21, Broker: Replica not
>> available
>>
>> Doh! Clearly there is a replica available. I can use kafkacat and
>> kafka-simple-consumer-shell to consume from this partition from either
>> in-sync replica just fine.
>>
>> This happens for all partitions for which the stopped broker was previously
>> the leader.
>>
>> Anyone know why I'd see something like this? I had not seen this error
>> before upgrading to 0.8.2.1.
>>
>> Thanks,
>> -Andrew
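
For reference, the distinction drawn at the top of this reply (Replica not available is survivable, Leader not available is not) could be applied by a client roughly as follows. This is only a sketch: the numeric codes are the Kafka protocol error codes for these two conditions, but the function name `partition_is_consumable` is hypothetical, and treating every other non-zero code as unusable is an assumption, not something Camus or kafkacat is known to do.

```python
# Kafka protocol error codes for the two conditions discussed in this thread.
NO_ERROR = 0
LEADER_NOT_AVAILABLE = 5   # partition has no leader, i.e. it is offline
REPLICA_NOT_AVAILABLE = 9  # some replica is down, but a leader still exists


def partition_is_consumable(error_code: int) -> bool:
    """Hypothetical helper: can a client still read from this partition?

    Replica-not-available is treated as non-fatal, since the leader is
    still serving requests. Any other non-zero code is conservatively
    treated as unusable (an assumption made for this sketch).
    """
    return error_code in (NO_ERROR, REPLICA_NOT_AVAILABLE)


if __name__ == "__main__":
    for code in (NO_ERROR, LEADER_NOT_AVAILABLE, REPLICA_NOT_AVAILABLE):
        print(code, partition_is_consumable(code))
```

Under this reading, a job that skips a partition on ReplicaNotAvailableException, as the Camus log above shows, is being stricter than it needs to be.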