[ https://issues.apache.org/jira/browse/KAFKA-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729413#comment-14729413 ]
Gwen Shapira commented on KAFKA-2510:
-------------------------------------

Note that if you accidentally manage to set an entire cluster to the wrong directory (easy when Chef or similar manages your configuration), you also lose the consumer offsets - so only clients that use an external offset store will even notice that the data is gone.

Losing ALL data in the cluster without any errors is a huge problem.

> Prevent broker from re-replicating / losing data due to disk misconfiguration
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-2510
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2510
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Gwen Shapira
>
> Currently Kafka assumes that whatever it sees in the data directory is the
> correct state of the data.
> This means that if an admin mistakenly configures Chef to use the wrong data
> directory, one of the following can happen:
> 1. The broker will replicate a bunch of partitions and take over the network
> 2. If you did this to enough brokers, you can lose entire topics and
> partitions.
> We have information about existing topics, partitions and their ISR in
> ZooKeeper.
> We need a mode in which if a broker starts, is in ISR for a partition and
> doesn't have any data or directory for the partition, the broker will issue a
> huge ERROR in the log and refuse to do anything for the partition.
> [~fpj] worked on the problem for ZK and had some ideas on what is required
> here.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
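
The startup check requested in the issue description (the broker is in the ISR for a partition but has no data or directory for it) could look roughly like the sketch below. It is an illustration under stated assumptions, not Kafka code: the function name `partitions_missing_on_disk` is hypothetical, the `isr_by_partition` dict stands in for the ISR state that would be read from ZooKeeper, and partition directories are assumed to follow the `<topic>-<partition>` naming used inside a broker data directory.

```python
import logging
import os
import tempfile

log = logging.getLogger("startup_check")


def partitions_missing_on_disk(broker_id, isr_by_partition, data_dir):
    """Return (topic, partition) pairs whose ISR lists this broker
    but for which data_dir holds no directory or an empty one.

    isr_by_partition is a hypothetical stand-in for ISR state read from
    ZooKeeper: {(topic, partition): [broker ids in the ISR]}.
    """
    missing = []
    for (topic, partition), isr in sorted(isr_by_partition.items()):
        if broker_id not in isr:
            continue  # not expected to hold in-sync data for this partition
        partition_dir = os.path.join(data_dir, f"{topic}-{partition}")
        # A missing directory and an empty directory are treated alike.
        if not os.path.isdir(partition_dir) or not os.listdir(partition_dir):
            log.error(
                "Broker %d is in ISR for %s-%d but has no data in %s; "
                "refusing to do anything for this partition",
                broker_id, topic, partition, data_dir,
            )
            missing.append((topic, partition))
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        # Only orders-0 exists on disk and contains a segment file.
        os.mkdir(os.path.join(data_dir, "orders-0"))
        with open(os.path.join(data_dir, "orders-0", "segment.log"), "w") as f:
            f.write("x")
        isr = {("orders", 0): [1, 2], ("orders", 1): [1, 3], ("clicks", 0): [2, 3]}
        print(partitions_missing_on_disk(1, isr, data_dir))  # [('orders', 1)]
```

A broker using such a check would skip fetching or serving the returned partitions until an operator confirms the data directory is correct, rather than silently re-replicating them.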