Do we have a JIRA to support removing dead brokers without having to start a new broker with the same id?
I think it's something we'll want to allow.

On Thu, Oct 2, 2014 at 7:45 AM, Jun Rao <jun...@gmail.com> wrote:

> The reassign-partition process only completes after the new replicas are
> fully caught up and the old replicas are deleted. So, if the old replica is
> down, the process can never complete, which is what you observed. In your
> case, if you just want to replace a broker host with a new one, instead of
> using the reassign-partition tool, simply start a new broker with the same
> broker id as the old one; the new broker will replicate all the data
> automatically.
>
> Thanks,
>
> Jun
>
> On Wed, Oct 1, 2014 at 3:43 PM, Lung, Paul <pl...@ebay.com> wrote:
>
>> Hi All,
>>
>> I had a 0.8.1.1 Kafka broker go down, and I was trying to use the
>> reassign-partitions script to move topics off that broker. When I
>> describe the topics, I see the following:
>>
>> Topic: mini__022____active_120__33__mini Partition: 0 Leader: 2131118
>>   Replicas: 2131118,2166601,2163421 Isr: 2131118,2166601
>>
>> This shows that broker 2163421 is down.
>> So I create the following file, /tmp/move_topic.json:
>>
>> {
>>   "version": 1,
>>   "partitions": [
>>     {
>>       "topic": "mini__022____active_120__33__mini",
>>       "partition": 0,
>>       "replicas": [2131118, 2166601, 2156998]
>>     }
>>   ]
>> }
>>
>> And then do this:
>>
>> ./kafka-reassign-partitions.sh --execute --reassignment-json-file /tmp/move_topic.json
>> Successfully started reassignment of partitions
>> {"version":1,"partitions":[{"topic":"mini__022____active_120__33__mini","partition":0,"replicas":[2131118,2166601,2156998]}]}
>>
>> However, when I try to verify this, I get the following error:
>>
>> ./kafka-reassign-partitions.sh --verify --reassignment-json-file /tmp/move_topic.json
>> Status of partition reassignment:
>> ERROR: Assigned replicas (2131118,2166601,2156998,2163421) don't match the
>> list of replicas for reassignment (2131118,2166601,2156998) for partition
>> [mini__022____active_120__33__mini,0]
>> Reassignment of partition [mini__022____active_120__33__mini,0] failed
>>
>> If I describe the topic, I now see there are four replicas. It has been
>> like this for many hours now, so it seems to have permanently moved to
>> four replicas for some reason.
>>
>> Topic:mini__022____active_120__33__mini PartitionCount:1 ReplicationFactor:4 Configs:
>> Topic: mini__022____active_120__33__mini Partition: 0 Leader: 2131118
>>   Replicas: 2131118,2166601,2156998,2163421 Isr: 2131118,2166601
>>
>> If I re-execute and re-verify, I get the same error. So it seems to be
>> wedged.
>>
>> Can someone help?
>>
>> Paul Lung
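For reference, Jun's suggested recovery (reuse the dead broker's id on a replacement host instead of reassigning partitions) can be sketched as below. This is a minimal sketch, not from the thread: the file paths assume a stock Kafka 0.8.x distribution layout, and 2163421 is the dead broker id from Paul's output.

```
# On the replacement host, configure the new broker with the DEAD
# broker's id, so it takes over that broker's replica assignments.
# (config/server.properties)
#   broker.id=2163421

# Then start the broker; it rejoins the cluster under the old id and
# re-replicates every partition that listed 2163421 as a replica.
bin/kafka-server-start.sh config/server.properties
```

No reassignment is needed in this case, since replication to the "new" 2163421 happens automatically once it is in the ISR catch-up path.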
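A reassignment file like Paul's /tmp/move_topic.json can also be generated programmatically, which avoids hand-editing errors when many partitions reference a dead broker. A minimal sketch, using the broker ids from the thread; the helper names (`replace_broker`, `build_reassignment`) are mine, not part of Kafka:

```python
import json

def replace_broker(replicas, dead, replacement):
    """Return a copy of the replica list with the dead broker id swapped out."""
    return [replacement if r == dead else r for r in replicas]

def build_reassignment(topic, partition, replicas):
    """Build the JSON document expected by kafka-reassign-partitions.sh."""
    return json.dumps({
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": replicas}
        ]
    })

# Swap dead broker 2163421 for 2156998, as in the thread.
new_replicas = replace_broker([2131118, 2166601, 2163421], 2163421, 2156998)
print(build_reassignment("mini__022____active_120__33__mini", 0, new_replicas))
```

The output can be written to a file and passed to `--reassignment-json-file`; note that, as Jun explains above, the reassignment still cannot complete while the old replica is offline.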