Rob,

Repairs are inevitably needed to maintain consistency guarantees. Is it 
worthwhile to consider RAID-0 as we get more storage? One could treat the loss 
of a disk as the loss of a node, then rebuild the node and run repair. Any other 
suggestions are most welcome.


-Sri
________________________________
From: Robert Coli <rc...@eventbrite.com>
Sent: Friday, April 10, 2015 6:51 PM
To: user@cassandra.apache.org
Subject: Re: Moving SSTables from one disk to another

On Fri, Apr 10, 2015 at 4:30 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:
However, it was pointed out to me that
https://issues.apache.org/jira/browse/CASSANDRA-6696 will be a better
solution in a lot of cases.

Thank you for the interesting link about a theoretical usage which would make 
JBOD worth using.

But I really don't understand why we consider use of the current JBOD 
implementation OK, when:

"In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
empty one and repair is run. This can cause deleted data to come back in some 
cases."

This class of issue is permanently fatal to consistency for the affected data.

Why are we encouraging people to expose themselves to this class of issue? What 
benefit do they get from current JBOD implementation that is worth this risk to 
consistency?

Yes, it's true that if an operator in this case never creates tombstones or 
never runs repair after losing only one disk, they're not exposed to the risk. 
But when they configure JBOD, the entire point is that they hope to run repair 
after losing only one disk, instead of rebuilding the entire node. The status 
quo seems to set up operators for failure when they attempt to do what the 
feature claims to be useful for.

I don't get "features" like this: questionable benefit, measurable risk, known 
serious issues, and yet they sit there in the product for years on end, daring 
someone to use them...

=Rob
