Do you see constant ISR shrinking/expansion for those two partitions in the
leader broker's log?
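
If it helps, here is a rough sketch for tallying those events per partition
from the leader's server log. It assumes the broker writes lines containing
"Shrinking ISR for partition [topic,partition]" and "Expanding ISR for
partition [topic,partition]"; that wording is an assumption and may differ in
0.8.0-beta1, so adjust the pattern to whatever your log actually shows. The
function name and the log path argument are illustrative only.

```python
import re
import sys
from collections import defaultdict

# Assumed log wording; adjust if your broker version phrases it differently.
ISR_CHANGE = re.compile(
    r"(Shrinking|Expanding) ISR for partition \[([^,\]]+),(\d+)\]"
)


def count_isr_changes(lines):
    """Count ISR shrink/expand events per (topic, partition)."""
    counts = defaultdict(lambda: {"shrink": 0, "expand": 0})
    for line in lines:
        match = ISR_CHANGE.search(line)
        if match is None:
            continue  # not an ISR change line
        kind, topic, partition = match.groups()
        key = (topic, int(partition))
        counts[key]["shrink" if kind == "Shrinking" else "expand"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Usage: python isr_changes.py /path/to/server.log  (path is hypothetical)
    with open(sys.argv[1]) as log_file:
        for (topic, partition), c in sorted(count_isr_changes(log_file).items()):
            print(f"{topic}-{partition}: {c['shrink']} shrinks, {c['expand']} expands")
```

A partition whose counts keep climbing is flapping in and out of the ISR; one
with no recent events at all suggests the ISR is simply stuck.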

Thanks,

Jun


On Fri, May 16, 2014 at 4:25 PM, Paul Mackles <pmack...@adobe.com> wrote:

> Hi - We are running kafka_2.8.0-0.8.0-beta1 (we are a little behind in
> upgrading).
>
> From what I can tell, connectivity to ZK was lost for a brief period. The
> cluster seemed to recover OK except that we now have 2 (out of 125)
> partitions where the ISR appears to be out of date. In other words,
> kafka-list-topic is showing only one replica in the ISR for the 2
> partitions in question (there should be 3).
>
> What's odd is that in looking at the log segments for those partitions on
> the file system, I can see that they are in fact getting updated and by all
> measures look to be in sync. I can also see that the brokers where the
> out-of-sync replicas reside are doing fine and leading other partitions
> like nothing ever happened. Based on that, it seems like the ISR in ZK is
> just out-of-date due to a botched recovery from the brief ZK outage.
>
> Has anyone seen anything like this before? I saw this ticket which sounded
> similar:
>
> https://issues.apache.org/jira/browse/KAFKA-948
>
> Anyone have any suggestions for recovering from this state? I was thinking
> of running the preferred-replica-election tool next to see if that gets the
> ISRs in ZK back in sync.
>
> After that, I guess the next step would be to bounce the kafka servers in
> question.
>
> Thanks,
> Paul
>
>
