On Mon, May 17, 2010 at 10:39 PM, Ma Xiao<max...@robowork.com> wrote:
Hi all,
We recently set up a 5-node Cassandra cluster with 4 x 1.5TB drives per node,
...then I put the 4 paths in DataFileDirectory.
My question is: what happens when one of the disks fails, especially the one
with the OS installed, which also holds the commit log? Can we simply replace
the disk and let Cassandra write the replicas back? Or should it be treated as
an entire node failure (wipe all data on the node and rejoin the ring)? That
would involve copying a large amount of data from other nodes even though only
one disk failed. Note, we don't have hardware-level RAID installed.
Any suggestions?
On 5/18/10 11:23 PM, Jonathan Ellis wrote:
> Yes, you can rely on replication for this (run nodetool repair).
@Ma Xiao :
A node that has lost data to hardware failure and is being "repair"ed can
serve "no value" reads for data it thinks it has but no longer actually holds,
when read with ConsistencyLevel.ONE. Each of these no-value reads triggers a
read repair, however, which should result in your hottest data being repaired
relatively quickly.
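(For reference, a minimal sketch of the replace-disk-then-restart-then-repair
flow Jonathan mentions; the hostname is a placeholder and the exact nodetool
flags vary slightly between versions:

    # run against the node that got the new disk, once Cassandra is back up
    nodetool -h <hostname> repair

Anti-entropy repair streams the missing replicas back from the other nodes;
read repair covers your hottest keys in the meantime, as described above.)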
As I understand it, using your 750GB-of-data case as an example, you are
likely to serve a non-trivial number of these no-value reads if you read with
CL.ONE in a disk-failure-then-repair scenario. Reading with CL.QUORUM avoids
this risk.
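To make the CL.QUORUM point concrete, here is a rough sketch of a
single-column read at QUORUM against the 0.6-era Thrift interface. The
keyspace, column family, key, and column names are placeholders, and the
generated class/package names may differ in your version:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class QuorumReadSketch {
    public static void main(String[] args) throws Exception {
        // Connect to any node's Thrift port (9160 by default).
        TTransport transport = new TSocket("<hostname>", 9160);
        Cassandra.Client client =
            new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();

        // Path to a single column; "Standard1" / "name" are placeholders.
        ColumnPath path = new ColumnPath("Standard1");
        path.setColumn("name".getBytes("UTF-8"));

        // QUORUM needs a majority of replicas to agree, so a single node
        // that is still repairing cannot, by itself, answer that the
        // column is missing.
        ColumnOrSuperColumn result =
            client.get("Keyspace1", "rowkey1", path, ConsistencyLevel.QUORUM);
        System.out.println(new String(result.column.value, "UTF-8"));

        transport.close();
    }
}

The same get() with ConsistencyLevel.ONE is the case described above, where a
single still-repairing replica can answer on its own and return no value.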
=Rob