It looks like broker 5 is in a bad state, and you are likely going to have to shut it down. Whether you shut it down, and what you do after that, depends on your environment setup. One way to go is to spin up another server with broker.id == 5 and let replication heal the topics that were durable (replication factor > 1). If you do that, you can then go back to the old server, debug what went wrong, and back up the data for the replication factor == 1 partitions so you can recover it later, once you have figured out what went wrong.
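Before shutting anything down, it may help to sort the affected partitions into the ones replication can heal and the ones whose only copy lives on broker 5. Here is a rough sketch, assuming the describe output has the same "topic: ... partition: ... leader: ... replicas: ... isr: ..." shape as the lines in your mail. The function names and category labels are made up for illustration and are not part of any Kafka tool.

```python
import re

# Matches lines shaped like the describe output quoted in this thread, e.g.
#   topic: topic4 partition: 1 leader: 6 replicas: 5,6 isr: 6
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]*)\s*"
    r"isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(text):
    """Turn '5,6' into [5, 6]; an empty field becomes an empty list."""
    return [int(x) for x in text.split(",") if x]


def parse_line(line):
    """Parse one describe line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "topic": m.group("topic"),
        "partition": int(m.group("partition")),
        "leader": int(m.group("leader")),
        "replicas": parse_ids(m.group("replicas")),
        "isr": parse_ids(m.group("isr")),
    }


def classify(p, broker_id):
    """Label a parsed partition relative to the suspect broker (labels are hypothetical)."""
    if broker_id not in p["replicas"]:
        return "unaffected"
    if p["replicas"] == [broker_id]:
        # Replication factor 1: the only copy is on the suspect broker.
        return "single-replica"
    if broker_id not in p["isr"]:
        # Another replica exists; the suspect broker has fallen out of the ISR.
        return "out-of-sync"
    return "in-sync"


def partitions_by_state(lines, broker_id):
    """Group (topic, partition) pairs by their label for the given broker."""
    groups = {}
    for line in lines:
        p = parse_line(line)
        if p is None:
            continue
        label = classify(p, broker_id)
        groups.setdefault(label, []).append((p["topic"], p["partition"]))
    return groups
```

Feeding it the describe output with broker_id=5, the "single-replica" group is the data you would want to back up from the old server, and the "out-of-sync" group is what a replacement broker 5 should catch up on through replication.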
/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/

On Tue, Dec 9, 2014 at 2:54 AM, ashendra bansal <ashendraban...@gmail.com> wrote:

> Hi,
>
> One of the brokers seems to have got corrupted in my cluster of 7
> brokers. All the topic partitions where this broker was leader are having
> NoLeader or UnderReplicated partition exceptions.
>
> All these partitions have no leader and not even a replica in the isr
> (in-sync replica) set.
>
> Corrupt broker id - 5.
>
> topic: topic1 partition: 2 leader: -1 replicas: 5 isr:
> topic: topic1 partition: 8 leader: -1 replicas: 5 isr:
> topic: topic1 partition: 14 leader: -1 replicas: 5 isr:
> topic: topic2 partition: 1 leader: -1 replicas: 5 isr:
> topic: topic2 partition: 8 leader: -1 replicas: 5 isr:
> topic: topic2 partition: 15 leader: -1 replicas: 5 isr:
> topic: topic3 partition: 1 leader: -1 replicas: 5 isr:
> topic: topic3 partition: 8 leader: -1 replicas: 5 isr:
> topic: topic3 partition: 15 leader: -1 replicas: 5 isr:
>
> I have tried the replication tools to manually assign a broker to these
> partitions, but that did not help, as none of them are in the isr set.
>
> Unfortunately the replication factor for these topics was 1. But for topics
> where the replication factor was higher, the problem persists. There the
> leader has been assigned to the next preferred replica, but the replica on
> the corrupt broker is not moved to the isr set even after a long time (days),
> and partitions have logs in order of 100s.
>
> topic: topic4 partition: 1 leader: 6 replicas: 5,6 isr: 6
>
> For the same topic, on the partition where the leader was not broker 5
> (the corrupted broker), broker 5 is still in the isr set.
>
> topic: topic4 partition: 0 leader: 4 replicas: 4,5 isr: 4,5
>
> Another observation: the corrupted broker has a topic creation log in its
> INFO logs, printed very frequently, every minute.
>
> [2014-12-09 13:07:27,878] INFO Topic creation { "partitions":{ "0":[ 4, 3 ], "1":[ 5, 4 ] }, "version":1 } (kafka.admin.AdminUtils$)
>
> Though there are no topics being created on the cluster.
>
> Has anyone faced a similar problem? How can I fix it?
>
> Ashendra
>