Two out of three of our Kafka nodes have become unrecoverable due to disk
corruption. I launched two new nodes, but they came up with new broker IDs.
To redistribute the topics across the cluster, I ran:
---
/opt/kafka/bin/kafka-reassign-partitions.sh \
  --broker-list "1003,1005,1006,1007" \
  --execute \
  --zookeeper zookeeper.service.consul:2181/kafka \
  --reassignment-json-file finalreassing.json
---
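
For reference, progress on a running reassignment can be checked with the same
tool's --verify mode against the same JSON file. A minimal sketch, reusing the
connection string and file name from the command above:
---
/opt/kafka/bin/kafka-reassign-partitions.sh \
  --zookeeper zookeeper.service.consul:2181/kafka \
  --reassignment-json-file finalreassing.json \
  --verify
---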

Most of the topics were reassigned correctly within half an hour; however, 11
of them remained in progress for more than 72 hours, and in addition:

1) they had more replicas than assigned
2) their replicas were assigned to brokers that are dead

As an example, consider the topic www:

   - The initial replica assignment before the disaster was 1001,1002,1003
   - We lost brokers 1001 and 1002 permanently; there is no recovering them
   - I ran kafka-reassign-partitions.sh --zookeeper zookeeper:2181/kafka
   --broker-list "1003,1005,1006" --execute --reassignment-json-file
   reassign.json
   - The contents of the JSON file are (the full file is sketched after this
   list):
      - ,{"topic":"www","partition":0,"replicas":[1003,1005,1006]}
   - The reassignment process starts, but never finishes
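
The tool expects those entries to be wrapped in a top-level
"version"/"partitions" object, so for completeness a full reassignment file
for this single partition would look roughly like this:
---
{"version":1,
 "partitions":[
   {"topic":"www","partition":0,"replicas":[1003,1005,1006]}
 ]}
---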

When I run kafka-topics.sh --describe ..., the assignment is not updated.
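
For reference, the describe command I am referring to has this form (a sketch,
assuming the same ZooKeeper connection string as above):
---
/opt/kafka/bin/kafka-topics.sh \
  --zookeeper zookeeper.service.consul:2181/kafka \
  --describe --topic www
---
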
The reassignment never completed, and I had to kill it by deleting the ZK
node /kafka/admin/reassign-partitions.
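
That deletion can be done, for example, with the ZooKeeper shell bundled with
Kafka; a minimal sketch, assuming the connection string from above:
---
/opt/kafka/bin/zookeeper-shell.sh zookeeper.service.consul:2181 \
  delete /kafka/admin/reassign-partitions
---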

I also tried removing the dead replicas by running the reassignment script
with the following JSON (the full file is sketched below):
{"topic":"www","partition":0,"replicas":[1003]}
But that does not seem to work either.
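
Spelled out with the same wrapper as before, the file for that attempt would
look roughly like this:
---
{"version":1,
 "partitions":[
   {"topic":"www","partition":0,"replicas":[1003]}
 ]}
---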

As a last resort, I also updated the assignment directly on the ZK node
/kafka/brokers/topics/www. The znode is updated, but Kafka instantly reports
that the replicas are caught up, even though there was around 20 GB of data
in the topic.
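
For what it is worth, the assignment stored in that znode can be inspected
with the same ZooKeeper shell (again a sketch, assuming the connection string
from above):
---
/opt/kafka/bin/zookeeper-shell.sh zookeeper.service.consul:2181 \
  get /kafka/brokers/topics/www
---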

Bottom line: I am not able to reassign replicas for topics whose replicas sit
on dead brokers. What would be a workaround to do this without losing data?
