Hm, interesting.  So my real issue is more with Camus than with cluster 
problems?  It seems that Camus won’t consume if it encounters a 
ReplicaNotAvailableException.
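
To check my understanding, here is a rough sketch of the distinction Clark describes below. It is not Camus's actual code. The numeric error codes are the Kafka protocol's (5 = LeaderNotAvailable, 9 = ReplicaNotAvailable); the function name and the "leader_id < 0 means no leader" check are my own assumptions for illustration.

```python
# Kafka protocol error codes that can appear in partition metadata.
NO_ERROR = 0
LEADER_NOT_AVAILABLE = 5
REPLICA_NOT_AVAILABLE = 9


def partition_is_consumable(error_code, leader_id):
    """Hypothetical helper: decide whether a consumer should read a partition.

    Assumption: a negative leader_id means the partition has no leader.
    """
    # No leader means the partition is offline, so there is nothing to read from.
    if error_code == LEADER_NOT_AVAILABLE or leader_id < 0:
        return False
    # A missing replica is not fatal: the leader can still serve fetches.
    return error_code in (NO_ERROR, REPLICA_NOT_AVAILABLE)


if __name__ == "__main__":
    # Partition 3 from the kafkacat output: leader 22, replica not available.
    print(partition_is_consumable(REPLICA_NOT_AVAILABLE, 22))
    # A partition with no leader at all.
    print(partition_is_consumable(LEADER_NOT_AVAILABLE, -1))
```

If that reading is right, a consumer that skips a partition on ReplicaNotAvailable is being stricter than it needs to be.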


> On Aug 15, 2015, at 12:02, Clark Haskins <cl...@kafka.guru> wrote:
> 
> Replica not available is not a fatal exception. This simply means that there 
> is a replica that is down.
> 
> If you get Leader not available that means the partition is offline.
> 
> -Clark
> 
>> On Aug 15, 2015, at 8:41 AM, Andrew Otto <ao...@wikimedia.org> wrote:
>> 
>> Also strange:  If I start this broker back up, and then issue a kafkacat 
>> metadata request, I do not see any 'Broker: Replica not available', even 
>> though this broker’s preferred partitions have not yet replicated back in 
>> sync, and are not the leader.  Everything seems normal.
>> 
>> Somehow this broker being offline makes the rest of the cluster think that 
>> none of its replicas are available.
>> 
>> 
>> 
>>> On Aug 15, 2015, at 11:18, Andrew Otto <ao...@wikimedia.org> wrote:
>>> 
>>> I am having trouble with a single broker causing consumers to lag.  As I am 
>>> troubleshooting this issue, I have stopped this broker in the hopes that 
>>> other replicas will take over as leader for this broker’s preferred 
>>> partitions.  However, when I do so, Camus reports:
>>> 
>>> kafka.CamusJob: Skipping the creation of ETL request for Topic : 
>>> webrequest_text and Partition : 3 Exception : 
>>> kafka.common.ReplicaNotAvailableException
>>> 
>>> kafka-topics.sh --describe shows:
>>> 
>>> Topic: webrequest_text    Partition: 3    Leader: 22    Replicas: 22,21,12    Isr: 22,21
>>> 
>>> However, when I use kafkacat to look at metadata (which asks for metadata 
>>> from Kafka rather than Zookeeper), I see:
>>> 
>>> partition 3, leader 22, replicas: 22,21, isrs: 22,21, Broker: Replica not 
>>> available
>>> 
>>> 
>>> Doh!  Clearly there is a replica available.  I can use kafkacat and 
>>> kafka-simple-consumer-shell to consume from this partition from either 
>>> in-sync replica just fine.
>>> 
>>> This happens for all partitions for which the stopped broker was previously 
>>> the leader.
>>> 
>>> Anyone know why I’d see something like this?  I have not seen this error 
>>> before upgrading to 0.8.2.1.
>>> 
>>> Thanks,
>>> -Andrew
>> 
