[ https://issues.apache.org/jira/browse/KAFKA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gokul updated KAFKA-4124:
-------------------------
    Description:
Currently, when a disk goes down, the broker goes down with it. Replacing the broker then causes excessive reshuffling of data over the network.
Make the broker resilient to disk failure. The broker can detect a disk failure, mark the disk as bad, and then re-replicate the under-replicated data onto the other available disks in the node. If the bad disk is replaced with a new one, the broker can rebalance the data among its disks. The broker can also tolerate up to n disk failures.

  was:
Currently, when a disk goes down, the broker goes down with it.
Make the broker resilient to disk failure. The broker can detect a disk failure, mark the disk as bad, and then re-replicate the under-replicated data onto the other available disks in the node. If the bad disk is replaced with a new one, the broker can rebalance the data among its disks.

> Handle disk failures gracefully
> -------------------------------
>
>                 Key: KAFKA-4124
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4124
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Gokul
>
> Currently, when a disk goes down, the broker goes down with it. Replacing
> the broker then causes excessive reshuffling of data over the network.
> Make the broker resilient to disk failure.
> The broker can detect a disk failure, mark the disk as bad, and then
> re-replicate the under-replicated data onto the other available disks in the
> node. If the bad disk is replaced with a new one, the broker can rebalance
> the data among its disks. The broker can also tolerate up to n disk failures.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
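
To make the requested behaviour concrete, here is a minimal sketch of the bookkeeping the description implies: mark a disk as bad, keep the broker up while no more than n disks have failed, and send re-replicated data to the least-used healthy disk. It is only an illustration under stated assumptions. `DiskPool`, its methods, and the "least-used disk" placement rule are hypothetical names and choices made for this example; they are not part of Kafka and not a proposed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DiskPool:
    """Hypothetical model of one broker's disks (not a Kafka API)."""
    max_failures: int                           # the "n" from the description
    usage: dict = field(default_factory=dict)   # healthy disk path -> bytes stored
    failed: set = field(default_factory=set)    # disk paths marked as bad

    def add_disk(self, path, used_bytes=0):
        # A new or replaced disk joins (or rejoins) the healthy set.
        self.failed.discard(path)
        self.usage[path] = used_bytes

    def mark_failed(self, path):
        """Mark a disk as bad; return True if the broker can keep running."""
        self.usage.pop(path, None)
        self.failed.add(path)
        return len(self.failed) <= self.max_failures and bool(self.usage)

    def pick_target(self):
        """Pick the least-used healthy disk (an assumed placement rule)."""
        if not self.usage:
            raise RuntimeError("no healthy disks left")
        return min(self.usage, key=self.usage.get)

    def place(self, size_bytes):
        """Record size_bytes of re-replicated data on the chosen disk."""
        target = self.pick_target()
        self.usage[target] += size_bytes
        return target
```

With `max_failures=1` and three disks, the first call to `mark_failed` returns `True` and later calls to `place` use only the two surviving disks. A second failure returns `False`, which corresponds to today's behaviour of taking the broker down.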