Controlled shutdown and leader election issues

Ryan Berdeen Thu, 20 Mar 2014 15:15:46 -0700

While upgrading from 0.8.0 to 0.8.1 in place, I observed some surprising
behavior using kafka.admin.ShutdownBroker. At the start, there were no
under-replicated partitions. After running

  bin/kafka-run-class.sh kafka.admin.ShutdownBroker --broker 10 ...

the partitions that had replicas on broker 10 were under-replicated:

  bin/kafka-topics.sh --describe --under-replicated-partitions ...
  Topic: analytics-activity Partition: 2  Leader: 12  Replicas: 12,10 Isr: 12
  Topic: analytics-activity Partition: 6  Leader: 11  Replicas: 11,10 Isr: 11
  Topic: analytics-activity Partition: 14 Leader: 14  Replicas: 14,10 Isr: 14
  ...
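
For reference, a partition shows up in this listing when its Isr list is
missing at least one of the brokers in its Replicas list. A minimal Python
sketch of that check follows. It assumes each partition is printed on a
single, unwrapped line with the fields shown above; the function name
under_replicated is hypothetical and not part of any Kafka tool.

```python
import re

# Matches one partition line of `kafka-topics.sh --describe` output.
# Assumption: the whole record is on a single line (not wrapped by a mail client).
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose ISR lacks at least one of its assigned replicas."""
    results = []
    for line in describe_output.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip headers, blank lines, and anything unrecognized
        replicas = [int(b) for b in m.group("replicas").split(",") if b]
        isr = [int(b) for b in m.group("isr").split(",") if b]
        missing = [b for b in replicas if b not in isr]
        if missing:
            results.append((m.group("topic"), int(m.group("partition")), missing))
    return results
```

Applied to the three lines above, every partition would be reported with
broker 10 as the missing replica, which matches the broker that was shut
down.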

While I was restarting the broker process, many produce requests failed
with kafka.common.UnknownTopicOrPartitionException.

After each broker restart, I ran the preferred leader election tool for
all topics. Now that all of the broker restarts are finished, the cluster
seems to be stuck in leader election. Running the tool again fails with
"kafka.admin.AdminOperationException: Preferred replica leader election
currently in progress..."

Are any of these known issues? Is there a safer way to shut down and
restart brokers that does not cause producer failures and under-replicated
partitions?
