While upgrading from 0.8.0 to 0.8.1 in place, I observed some surprising behavior using kafka.admin.ShutdownBroker. At the start, there were no under-replicated partitions. After running

bin/kafka-run-class.sh kafka.admin.ShutdownBroker --broker 10 ...

partitions that had replicas on broker 10 were under-replicated:

bin/kafka-topics.sh --describe --under-replicated-partitions ...
Topic: analytics-activity Partition: 2 Leader: 12 Replicas: 12,10 Isr: 12
Topic: analytics-activity Partition: 6 Leader: 11 Replicas: 11,10 Isr: 11
Topic: analytics-activity Partition: 14 Leader: 14 Replicas: 14,10 Isr: 14
...

While restarting the broker process, many produce requests failed with kafka.common.UnknownTopicOrPartitionException.

After each broker restart, I used the preferred leader election tool for all topics. Now, after finishing all of the broker restarts, the cluster seems to be stuck in leader election. Running the tool fails with "kafka.admin.AdminOperationException: Preferred replica leader election currently in progress..."

Are any of these known issues? Is there a safer way to shut down and restart brokers that does not cause producer failures and under-replicated partitions?
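
In case it helps anyone reproduce this, here is a minimal sketch of how the --describe output above could be checked between restarts. It only assumes the "Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..." layout shown in this message; the function name under_replicated and the regular expression are my own illustration, not part of Kafka.

```python
import re

# Matches one partition entry in the layout shown above (assumed format,
# fields separated by any whitespace, including tabs).
PARTITION_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def under_replicated(describe_output):
    """Return (topic, partition, missing_broker_ids) for every partition
    whose ISR does not contain all of its assigned replicas."""
    results = []
    for match in PARTITION_RE.finditer(describe_output):
        replicas = [int(b) for b in match.group("replicas").split(",")]
        isr = {int(b) for b in match.group("isr").split(",") if b}
        missing = [b for b in replicas if b not in isr]
        if missing:
            results.append(
                (match.group("topic"), int(match.group("partition")), missing)
            )
    return results
```

Feeding it the three partition lines quoted above should report broker 10 as the missing replica for each of them. A restart script could wait until the returned list is empty before moving on to the next broker.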