* Scenario 1: BrokerID 1,2,3. Broker 2 dies. Here, you can use the reassign partitions tool: for every partition that had a replica on broker 2, move that replica to broker 4.
* Scenario 2: BrokerID 1,2,3. Catastrophic failure: 1,2,3 die but ZK is still there. There is no way to recover any data here, since there is nothing left to consume data from.

Thanks,
Neha

On Fri, Mar 22, 2013 at 10:46 AM, Scott Clasen <sc...@heroku.com> wrote:

> What would the recommended practice be for the following scenarios?
>
> Running on EC2, ephemeral disks only for kafka.
>
> There are 3 kafka servers. The broker ids are always increasing. If a
> broker dies it's never coming back.
>
> All topics have a replication factor of 3.
>
> * Scenario 1: BrokerID 1,2,3 Broker 2 dies.
>
> Recover by:
>
> Boot another: BrokerID 4
> ?? run bin/kafka-reassign-partitions.sh for any topic+partition and
> replace brokerid 2 with brokerid 4
> ?? anything else to do to cause messages to be replicated to 4??
>
> NOTE: This appears to work but not positive 4 got messages replicated to it.
>
> * Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
> there.
>
> Messages obviously lost.
> Recover to a functional state by:
>
> Boot 3 more: 4,5,6
> ?? run bin/kafka-reassign-partitions.sh for all topics/partitions, swap
> 1,2,3 for 4,5,6?
> ?? run bin/kafka-preferred-replica-election.sh for all topics/partitions
> ?? anything else to do to allow producers to start sending successfully??
>
>
> NOTE: I had some trouble with scenario 2. Will try to reproduce and open a
> ticket, if in fact my procedures for scenario 2 are correct, and I still
> can't get to a good state.
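
For Scenario 1, the replica swap (broker 2 replaced by broker 4 in every affected partition) could be prepared with a small script like the one below. This is only a sketch: the topic name `events`, the helper `replace_broker`, and the sample assignment are hypothetical, and the JSON layout (`version`, `partitions`, `topic`, `partition`, `replicas`) is assumed to match what your version of bin/kafka-reassign-partitions.sh accepts, so check it against the tool's own help output before using it.

```python
import json


def replace_broker(assignment, dead, new):
    """Build a reassignment plan with `dead` swapped for `new`.

    `assignment` is a list of dicts with "topic", "partition" and
    "replicas" keys (hypothetical input, e.g. collected by hand).
    Only partitions that had a replica on `dead` are included.
    """
    plan = []
    for entry in assignment:
        if dead in entry["replicas"]:
            # Keep replica order, substituting the new broker id in place.
            replicas = [new if b == dead else b for b in entry["replicas"]]
            plan.append({
                "topic": entry["topic"],
                "partition": entry["partition"],
                "replicas": replicas,
            })
    # Assumed file layout for the reassignment tool; verify for your version.
    return {"version": 1, "partitions": plan}


# Hypothetical current assignment for one topic with replication factor 3.
current = [
    {"topic": "events", "partition": 0, "replicas": [1, 2, 3]},
    {"topic": "events", "partition": 1, "replicas": [2, 3, 1]},
]

plan = replace_broker(current, dead=2, new=4)
print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to the reassignment tool. Keeping the replica order intact means broker 4 takes over whatever position broker 2 held in each replica list.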