Thank you for the answers. We are using the current version, 3.11.3, so this one includes CASSANDRA-6696. If I understand this correctly, losing system tables will require a full node rebuild; otherwise, repair will get the node consistent again.
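For reference, a minimal sketch of the replace-and-repair flow on 3.11.x, assuming a systemd-managed service and an example JBOD mount point (the service name and the paths are assumptions, not our actual layout):

  # Stop the node before swapping the failed JBOD disk.
  sudo systemctl stop cassandra

  # After the physical replacement, recreate the (now empty) data directory
  # that cassandra.yaml lists under data_file_directories.
  sudo mkdir -p /mnt/disk2/cassandra/data
  sudo chown -R cassandra:cassandra /mnt/disk2/cassandra/data

  # Bring the node back up; it rejoins with one empty data directory.
  sudo systemctl start cassandra

  # Stream the missing data back from the other replicas.
  nodetool repair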
Regards,
Christian

From: kurt greaves <k...@instaclustr.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 15 August 2018 at 04:53
To: User <user@cassandra.apache.org>
Subject: Re: JBOD disk failure

If that disk had important data in the system tables, however, you might have some trouble and need to replace the entire instance anyway.

On 15 August 2018 at 12:20, Jeff Jirsa <jji...@gmail.com> wrote:

Depends on the version.

For versions without the fix from CASSANDRA-6696, the only safe option on a single disk failure is to stop and replace the whole instance. This is important because in older versions of Cassandra, you could have data in one sstable and a tombstone shadowing it on another disk, and that tombstone could be well past gc_grace_seconds. On disk failure in this scenario, if the disk holding the tombstone is lost, repair will propagate the (deleted, now resurrected) data to the other replicas, which probably isn't what you want to happen.

With 6696, you should be safe to replace the disk and run repair - 6696 keeps all data for a given token range on the same disk, so the resurrection problem is solved.

--
Jeff Jirsa

On Aug 14, 2018, at 6:10 AM, Christian Lorenz <christian.lor...@webtrekk.com> wrote:

Hi,

given a cluster with RF=3 and CL=LOCAL_ONE, and an application that is deleting data, what happens if the nodes are set up with JBOD and one disk fails? Do I get consistent results while the broken drive is replaced and a nodetool repair is running on the node with the replaced drive?

Kind regards,
Christian
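On the original question about CL=LOCAL_ONE: while the replaced drive is still being repaired, a LOCAL_ONE read can be served entirely by the rebuilt node and so miss data. A minimal cqlsh sketch of raising the read consistency for that window (keyspace and table names are made up):

  $ cqlsh
  cqlsh> CONSISTENCY LOCAL_QUORUM;
  cqlsh> SELECT * FROM myks.mytable WHERE id = 42;

With RF=3, a LOCAL_QUORUM read touches two replicas, so the one node still streaming data back cannot by itself produce an incomplete result.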