Hm, interesting. So my real issue is more with Camus than with cluster problems? It seems that Camus won’t consume if it encounters a ReplicaNotAvailableException.
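
To make Clark's distinction concrete, here is a rough sketch of how I would expect a consumer to treat per-partition errors in a metadata response. This is not Camus's actual code: the helper name partition_is_consumable is made up for illustration. The only facts assumed are the Kafka protocol error codes (5 for LeaderNotAvailable, 9 for ReplicaNotAvailable).

```python
# Kafka protocol error codes as reported per partition in a metadata response.
NO_ERROR = 0
LEADER_NOT_AVAILABLE = 5
REPLICA_NOT_AVAILABLE = 9


def partition_is_consumable(error_code, leader_id):
    """Hypothetical helper: decide whether a fetch request can be created.

    A partition can be read as long as it has a live leader. A replica
    being down (REPLICA_NOT_AVAILABLE) does not prevent consumption.
    """
    # No leader at all means the partition is offline.
    if leader_id is None or leader_id < 0:
        return False
    if error_code == LEADER_NOT_AVAILABLE:
        return False
    # Treat "replica not available" as a warning, not a fatal error.
    return error_code in (NO_ERROR, REPLICA_NOT_AVAILABLE)
```

With the metadata from my original mail (partition 3, leader 22, Replica not available), a check like this would still allow the partition to be consumed, whereas Camus appears to skip it.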
> On Aug 15, 2015, at 12:02, Clark Haskins <cl...@kafka.guru> wrote:
>
> Replica not available is not a fatal exception. This simply means that there
> is a replica that is down.
>
> If you get Leader not available that means the partition is offline.
>
> -Clark
>
> Sent from my iPhone
>
>> On Aug 15, 2015, at 8:41 AM, Andrew Otto <ao...@wikimedia.org> wrote:
>>
>> Also strange: If I start this broker back up, and then issue a kafkacat
>> metadata request, I do not see any 'Broker: Replica not available', even
>> though this broker’s preferred partitions have not yet replicated back in
>> sync, and are not the leader. Everything seems normal.
>>
>> Somehow this broker being offline makes the rest of the cluster think that
>> none of its replicas are available.
>>
>>> On Aug 15, 2015, at 11:18, Andrew Otto <ao...@wikimedia.org> wrote:
>>>
>>> I am having trouble with a single broker causing consumers to lag. As I am
>>> troubleshooting this issue, I have stopped this broker in the hopes that
>>> other replicas will take over as leader for this broker’s preferred
>>> partitions. However, when I do so, Camus reports:
>>>
>>> kafka.CamusJob: Skipping the creation of ETL request for Topic :
>>> webrequest_text and Partition : 3 Exception :
>>> kafka.common.ReplicaNotAvailableException
>>>
>>> kafka-topics.sh --describe shows:
>>>
>>> Topic: webrequest_text Partition: 3 Leader: 22 Replicas: 22,21,12
>>> Isr: 22,21
>>>
>>> However, when I use kafkacat to look at metadata (which asks for metadata
>>> from Kafka rather than Zookeeper), I see:
>>>
>>> partition 3, leader 22, replicas: 22,21, isrs: 22,21, Broker: Replica not
>>> available
>>>
>>> Doh! Clearly there is a replica available. I can use kafkacat and
>>> kafka-simple-consumer-shell to consume from this partition from either in
>>> sync replica just fine.
>>>
>>> This happens for all partitions for which the stopped broker was previously
>>> the leader.
>>>
>>> Anyone know why I’d see something like this? I had not seen this error
>>> before upgrading to 0.8.2.1.
>>>
>>> Thanks,
>>> -Andrew