My hypothesis for how partition [luke3,3], with leader 11, had its offset reset to zero after the leader broker was rebooted during partition reassignment:
The replicas for [luke3,3] were in the process of being reassigned from brokers 10,11,12 -> 11,12,13.

1. I rebooted broker 11, which was the leader for [luke3,3].
2. The broker 12 and 13 logs indicate replica fetch failures from leader 11 due to connection timeouts.
3. Broker 10 attempts to become the leader for [luke3,3] but has an issue (I see a ZK exception, but I'm unsure what is happening).
4. Broker 11 eventually comes back online and attempts to fetch from the new leader, broker 10.
5. Broker 11 completes the fetch from leader 10 at offset 0.
6. Broker 10 is the leader but is serving a new data log, and the offset has been reset.
7. The remaining brokers truncate their logs and follow broker 10.

Gist of logs for brokers 13, 11, and 12 that I think backs up this summary:
https://gist.github.com/anonymous/cb79dc251d87e334cfff

Thanks,
Luke Forehand | Networked Insights | Software Engineer

On 6/23/14, 5:57 PM, "Guozhang Wang" <wangg...@gmail.com> wrote:

>Hi Luke,
>
>What are the exceptions/warnings you saw in the broker and controller
>logs?
>
>Guozhang
>
>
>On Mon, Jun 23, 2014 at 2:03 PM, Luke Forehand <
>luke.foreh...@networkedinsights.com> wrote:
>
>> Hello,
>>
>> I am testing kafka 0.8.1.1 in preparation for an upgrade from
>> kafka-0.8.1-beta. I have a 4 node cluster with one broker per node,
>> and a topic with 8 partitions and 3 replicas. Each partition has about
>> 6 million records.
>>
>> I generated a partition reassignment json that basically causes all
>> partitions to be shifted by one broker. As the reassignment was in
>> progress I bounced one of the servers. After the server came back up
>> and the broker started, I waited for the server logs to stop complaining
>> and then ran the reassignment verify script, and all partitions were
>> verified as having completed reassignment.
>>
>> However, one of the partition offsets was reset to 0, and 4 out of 8
>> partitions only had 2 in-sync replicas instead of 3 (in-sync came back
>> to 3, but only after I again bounced the server I had previously bounced
>> during reassignment).
>>
>> Is this considered a bug? I ask because we use the SimpleConsumer API,
>> so we keep track of our own offset "pointers". If it is not a bug, then I
>> could reset the pointer to "earliest" and continue reading, but I was
>> wondering if there is potential for data loss in my scenario. I have
>> plenty of logs and can reproduce, but before I spam I was wondering if
>> there is already a jira task for this issue or if anybody else is aware.
>>
>> Thanks,
>> Luke Forehand | Networked Insights | Software Engineer
>>
>>
>
>
>--
>-- Guozhang
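For reference, a minimal sketch (against the 0.8.1 Java SimpleConsumer API) of asking the current leader for the earliest available offset of [luke3,3]. This is how a client that tracks its own offset pointers could confirm that the leader's log now starts at 0, and it is the same EarliestTime lookup a SimpleConsumer client would use to reset its pointer to "earliest" as mentioned above. The broker hostname, port, and clientId below are placeholders, not values from this cluster.

import java.util.Collections;
import java.util.Map;

import kafka.api.PartitionOffsetRequestInfo;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class EarliestOffsetCheck {
    public static void main(String[] args) {
        // Placeholder host/port for whichever broker is currently the leader for [luke3,3]
        SimpleConsumer consumer = new SimpleConsumer("broker10.example.com", 9092,
                100000, 64 * 1024, "offset-check");
        try {
            TopicAndPartition tap = new TopicAndPartition("luke3", 3);
            // EarliestTime (-2) asks for the smallest offset still present in the leader's log
            Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo =
                    Collections.singletonMap(tap,
                            new PartitionOffsetRequestInfo(kafka.api.OffsetRequest.EarliestTime(), 1));
            kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(
                    requestInfo, kafka.api.OffsetRequest.CurrentVersion(), "offset-check");
            OffsetResponse response = consumer.getOffsetsBefore(request);
            if (response.hasError()) {
                System.err.println("Error fetching offsets: " + response.errorCode("luke3", 3));
                return;
            }
            // If the leader really is serving a fresh log, this prints 0 even though the
            // partition previously held roughly 6 million records.
            long earliest = response.offsets("luke3", 3)[0];
            System.out.println("Earliest available offset for [luke3,3]: " + earliest);
        } finally {
            consumer.close();
        }
    }
}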