Agreed. I think it is also a replacement scenario, since we may want to bring in a new broker to stand in for the dead broker. We should support explicit removal as well. We could add a mode to ReassignPartitionsCommand for an explicit "make the new broker just like the old broker". This matters a lot in cloud environments, where the broker id is the ip address and instances come and go. Many scripts get written to perform a bunch of steps when it could just be one command. That would make automatically "in-servicing" a new broker, and having it take over the work of the broker it replaces, much simpler and more straightforward. It would also make the process more consistent across the community. I just created a JIRA for this: https://issues.apache.org/jira/browse/KAFKA-1678
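For reference, the existing manual workaround (the one Jun describes in the quoted thread below) amounts to reusing the dead broker's id on the replacement instance. A minimal sketch of the relevant server.properties line — the id is the one from this thread, everything else about the config is assumed to match the old broker:

```properties
# server.properties on the replacement instance:
# reuse the dead broker's id so it inherits that broker's replica assignments
broker.id=2163421
```

Once the new broker starts with that id, it rejoins the cluster in the old broker's place and re-replicates its partitions automatically. The proposed tooling change would fold the surrounding steps (provisioning, config templating, verification) into one command.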
On Mon, Oct 6, 2014 at 2:10 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

> Do we have a jira to support removal of dead brokers without having to
> start a new broker with the same id?
>
> I think it's something we'll want to allow.
>
> On Thu, Oct 2, 2014 at 7:45 AM, Jun Rao <jun...@gmail.com> wrote:
>
> > The reassign partition process only completes after the new replicas are
> > fully caught up and the old replicas are deleted. So, if the old replica is
> > down, the process can never complete, which is what you observed. In your
> > case, if you just want to replace a broker host with a new one, instead of
> > using the reassign partition tool, simply start a new broker with the same
> > broker id as the old one; the new broker will replicate all the data
> > automatically.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Oct 1, 2014 at 3:43 PM, Lung, Paul <pl...@ebay.com> wrote:
> >
> >> Hi All,
> >>
> >> I had a 0.8.1.1 Kafka broker go down, and I was trying to use the
> >> reassign partition script to move topics off that broker. When I
> >> describe the topics, I see the following:
> >>
> >> Topic: mini__022____active_120__33__mini Partition: 0 Leader: 2131118
> >> Replicas: 2131118,2166601,2163421 Isr: 2131118,2166601
> >>
> >> This shows that the broker "2163421" is down.
> >> So I create the following file /tmp/move_topic.json:
> >>
> >> {
> >>   "version": 1,
> >>   "partitions": [
> >>     {
> >>       "topic": "mini__022____active_120__33__mini",
> >>       "partition": 0,
> >>       "replicas": [2131118, 2166601, 2156998]
> >>     }
> >>   ]
> >> }
> >>
> >> And then do this:
> >>
> >> ./kafka-reassign-partitions.sh --execute --reassignment-json-file /tmp/move_topic.json
> >> Successfully started reassignment of partitions
> >> {"version":1,"partitions":[{"topic":"mini__022____active_120__33__mini","partition":0,"replicas":[2131118,2166601,2156998]}]}
> >>
> >> However, when I try to verify this, I get the following error:
> >>
> >> ./kafka-reassign-partitions.sh --verify --reassignment-json-file /tmp/move_topic.json
> >> Status of partition reassignment:
> >> ERROR: Assigned replicas (2131118,2166601,2156998,2163421) don't match
> >> the list of replicas for reassignment (2131118,2166601,2156998) for
> >> partition [mini__022____active_120__33__mini,0]
> >> Reassignment of partition [mini__022____active_120__33__mini,0] failed
> >>
> >> If I describe the topics, I now see there are 4 replicas. This has been
> >> like this for many hours now, so it seems to have permanently moved to
> >> 4 replicas for some reason.
> >>
> >> Topic:mini__022____active_120__33__mini PartitionCount:1 ReplicationFactor:4 Configs:
> >> Topic: mini__022____active_120__33__mini Partition: 0 Leader: 2131118
> >> Replicas: 2131118,2166601,2156998,2163421 Isr: 2131118,2166601
> >>
> >> If I re-execute and re-verify, I get the same error. So it seems to be
> >> wedged.
> >>
> >> Can someone help?
> >>
> >> Paul Lung
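P.S. For what it's worth, a quick sanity check of a reassignment JSON file before running --execute can catch shape problems (wrong version, duplicate broker ids, missing fields) early. This is just a minimal sketch, nothing Kafka-specific; the function name is mine:

```python
import json

def check_reassignment(path):
    """Validate the basic shape of a kafka-reassign-partitions JSON file."""
    with open(path) as f:
        plan = json.load(f)
    # The tool expects a top-level version of 1 and a list of partitions.
    assert plan.get("version") == 1, "expected version 1"
    for p in plan["partitions"]:
        assert isinstance(p["topic"], str) and p["topic"], "missing topic"
        assert isinstance(p["partition"], int) and p["partition"] >= 0, "bad partition"
        replicas = p["replicas"]
        # A broker id listed twice in the replica list is always a mistake.
        assert replicas and len(set(replicas)) == len(replicas), "duplicate broker ids"
    return plan
```

It won't catch the wedged-reassignment problem above (that requires the old replica to come back or the controller state to be cleared), but it does rule out malformed input as the cause.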