Sounds like you are bumping into this
https://issues.apache.org/jira/browse/KAFKA-1367
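
If you want to confirm that, here is a minimal Python sketch for diffing the two views. It assumes each partition entry has been rejoined onto a single line (mail wrapping undone). The names `parse_partition_lines` and `isr_mismatches` are made up for illustration; they are not part of Kafka or of any client library. The sample strings are copied from the kafka-topics.sh output and the initial-state client log quoted below.

```python
import re

# Accepts both layouts seen in this thread:
#   kafka-topics.sh: "Topic: t Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 2,1,3"
#   client log:      "t Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,"
LINE_RE = re.compile(
    r"(?:Topic:\s*)?(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+"
    r"Replicas:\s*(?P<replicas>[\d,\s]*?)\s*"
    r"Isr:\s*(?P<isr>[\d,\s]*)"
)


def _ids(text):
    """Extract broker ids from a comma-separated list, tolerating trailing commas."""
    return [int(x) for x in re.findall(r"\d+", text)]


def parse_partition_lines(text):
    """Map (topic, partition) to its leader, replicas and ISR.

    Lines that do not look like a partition entry are skipped, and a later
    entry for the same partition overwrites an earlier one.
    """
    state = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        key = (m.group("topic"), int(m.group("partition")))
        state[key] = {
            "leader": int(m.group("leader")),
            "replicas": _ids(m.group("replicas")),
            "isr": _ids(m.group("isr")),
        }
    return state


def isr_mismatches(reference, observed):
    """Return [(key, reference_isr, observed_isr)] for partitions whose ISR sets differ."""
    out = []
    for key in sorted(reference.keys() & observed.keys()):
        ref_isr = sorted(reference[key]["isr"])
        obs_isr = sorted(observed[key]["isr"])
        if ref_isr != obs_isr:
            out.append((key, ref_isr, obs_isr))
    return out


# Sample data copied from the quoted message (whitespace normalised).
TOOL_OUTPUT = """
Topic: saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 2,1,3
Topic: saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
"""
CLIENT_LOG = """
saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
"""

if __name__ == "__main__":
    tool = parse_partition_lines(TOOL_OUTPUT)
    client = parse_partition_lines(CLIENT_LOG)
    for key, ref, obs in isr_mismatches(tool, client):
        # prints: ('saka.test.int_datastream', 1) tool=[1, 2, 3] client=[1, 2]
        print(f"{key} tool={ref} client={obs}")
```

ISRs are compared as sorted lists because the two sources report the same brokers in different orders (for example `2,1,3` versus `1, 3, 2,`), so only membership matters here.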

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Wed, Jan 21, 2015 at 10:10 AM, svante karlsson <s...@csi.se> wrote:

> We are running an external (as in unsupported) C++ client library
> against 0.8.2-rc2 and see differences between the Isr vector in the
> Metadata Response and what ./kafka-topics.sh --describe returns.
>
> We have a triple replicated topic that is not updated during the test.
>
> kafka-topics.sh returns:
>
> Topic: saka.test.int_datastream Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 2,1,3
> Topic: saka.test.int_datastream Partition: 1    Leader: 1       Replicas: 1,2,3 Isr: 2,1,3
>
>
> After some debugging of the received packet, it seems the data is actually
> missing from the server's response.
>
> After a sequential restart of each broker, everything was back to normal.
>
> The client logs two pairs of lines every 10 s.
>
> initial state:
>
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
>
> restart broker 1
>
> handle_connect_retry_timer
> _connect_async_next z8r102-mc12-4-4.sth-tc2.videoplaza.net:9092
>
> saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> ...
> saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 2 Replicas: 1, 2, 3, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2,
> 3, 1,
>
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2,
> 3,
>
> restart broker 3
>
> known brokers changed {....  }
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 1, 2, Isr: 2, 1,
> known brokers changed { .... }
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 2 Replicas: 3, 1, 2, Isr: 2, 1,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 2, 1,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 2, 1, 3,
>
> restart broker 2
>
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 1 Leader: 1 Replicas: 1, 2, 3, Isr: 1, 3, 2,
> saka.test.int_datastream Partition: 0 Leader: 3 Replicas: 3, 1, 2, Isr: 1, 3, 2,
>
>
> All this time, kafka-topics.sh returns the following (except for a very
> short time during each restart):
>
> Topic: saka.test.int_datastream Partition: 0    Leader: 3       Replicas: 3,1,2 Isr: 2,1,3
> Topic: saka.test.int_datastream Partition: 1    Leader: 1       Replicas: 1,2,3 Isr: 2,1,3
>
>
> This seems reproducible by shutting down all brokers at the same time. After
> that, the Isr vectors never "heal". Restarting the brokers one by one heals
> them again.
>
> /svante
