Rob,
Repairs are inevitably needed to maintain consistency guarantees. Is it worthwhile to consider RAID-0 as we get more storage? One can treat the loss of a disk as the loss of the node, then rebuild the node and repair. Any other suggestions are most welcome.

-Sri

________________________________
From: Robert Coli <rc...@eventbrite.com>
Sent: Friday, April 10, 2015 6:51 PM
To: user@cassandra.apache.org
Subject: Re: Moving SSTables from one disk to another

On Fri, Apr 10, 2015 at 4:30 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> However, it was pointed out to me that https://issues.apache.org/jira/browse/CASSANDRA-6696 will be a better solution in a lot of cases.

Thank you for the interesting link about a theoretical usage which would make JBOD worth using. But I really don't understand why we consider the use of the current JBOD ok, when:

"In JBOD, when someone gets a bad drive, the bad drive is replaced with a new empty one and repair is run. This can cause deleted data to come back in some cases."

This class of issue is permanently fatal to consistency for the affected data. Why are we encouraging people to expose themselves to this class of issue? What benefit do they get from the current JBOD implementation that is worth this risk to consistency?

Yes, it's true that if an operator in this case never creates tombstones or never runs repair after losing only one disk, they're not exposed to the risk. But when they configure JBOD, the entire point is that they hope to run repair after losing only one disk, instead of rebuilding the entire node. The status quo seems to set up operators for failure when they attempt to do what the feature claims to be useful for.

I don't get "features" like this: questionable benefit, measurable risk, known serious issues, and yet they sit there in the product for years on end, daring someone to use them...

=Rob