Hi everyone,

We run a 3-broker Kafka cluster on 0.8.1.1, with every topic having a replication factor of 3, meaning every broker holds a replica of every partition.
We recently ran into this issue (https://issues.apache.org/jira/browse/KAFKA-1028) and saw data loss within Kafka. We understand why it happened and have plans to try to ensure it doesn't happen again. The strange part was that the broker chosen in the unclean leader election appeared to drop all of its own data for the partition in the process: our monitoring shows that broker's offset was reset to 0 for a number of partitions.

Following that broker's server logs in chronological order for one partition that saw data loss, I see:

  2014-10-16 10:18:11,104 INFO kafka.log.Log: Completed load of log TOPIC-6 with log end offset 528026
  2014-10-16 10:20:18,144 WARN kafka.controller.OfflinePartitionLeaderSelector: [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [TOPIC,6]. Elect leader 1 from live brokers 1,2. There's potential data loss.
  2014-10-16 10:20:18,277 WARN kafka.cluster.Partition: Partition [TOPIC,6] on broker 1: No checkpointed highwatermark is found for partition [TOPIC,6]
  2014-10-16 10:20:18,698 INFO kafka.log.Log: Truncating log TOPIC-6 to offset 0.
  2014-10-16 10:21:18,788 INFO kafka.log.OffsetIndex: Deleting index /storage/kafka/00/kafka_data/TOPIC-6/00000000000000528024.index.deleted
  2014-10-16 10:21:18,781 INFO kafka.log.Log: Deleting segment 528024 from log TOPIC-6.

I'm not too worried about this since I'm hoping to move to Kafka 0.8.2 ASAP, but I was curious whether anyone could explain this behavior.

-Bryan
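
P.S. For anyone hitting the same thing: my understanding is that KAFKA-1028 is the ticket that adds a switch to prefer consistency over availability, so once we're on 0.8.2 the plan is to turn unclean leader election off. Roughly (not yet verified on our side), the broker-level setting would look like:

  # server.properties -- disable unclean leader election (0.8.2+, default is true)
  unclean.leader.election.enable=false

The trade-off being that a partition stays offline until a broker from the ISR comes back, instead of electing an out-of-sync replica and losing data.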