One of the brokers in our cluster had an unclean shutdown, and after it was restarted I found the following log entries:

$ grep "clean shutdown" /var/groupon/kafka/kafka-broker.log
02/Sep/2015 16:19:23 - warn::[Kafka Server 1], Proceeding to do an unclean shutdown as all the controlled shutdown attempts failed
02/Sep/2015 16:22:06 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol1'
02/Sep/2015 16:22:11 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol2'
02/Sep/2015 16:22:15 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol3'
02/Sep/2015 16:22:18 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol4'
02/Sep/2015 16:22:22 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol5'
02/Sep/2015 16:22:22 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol6'
02/Sep/2015 16:22:26 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol7'
02/Sep/2015 16:22:29 - info::Found clean shutdown file. Skipping recovery for all logs in data directory '/data/vol8'

So the broker logged an unclean shutdown, yet on restart it found a clean shutdown file in every data directory and skipped log recovery. The partitions hosted on this broker are not catching up with the other replicas, and I found that the ReplicaFetcher threads for each of those partitions have died.

Is anyone aware of how to get out of this situation? I was trying to locate the shutdown file (maybe it was left over from a previous run) and delete it.

Additional Information
~~~~~~~~~~~~~~~~~~~~~~
Kafka v0.8.1.1
CentOS 5
11 node cluster with replication factor 3
Disks are JBODs

Thanks,
Pradeep