Thank you for the answers. We are using the current version, 3.11.3, so this one includes CASSANDRA-6696. If I understand this correctly, losing system tables will require a full node rebuild; otherwise, repair will get the node consistent again.
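For reference, a minimal sketch of the replace-and-repair flow on 3.11.x, assuming a systemd-managed service and an example JBOD mount point (the service name and the paths are assumptions, not our actual layout):

  # Stop the node before swapping the failed JBOD disk.
  sudo systemctl stop cassandra

  # After the physical replacement, recreate the (now empty) data directory
  # that cassandra.yaml lists under data_file_directories.
  sudo mkdir -p /mnt/disk2/cassandra/data
  sudo chown -R cassandra:cassandra /mnt/disk2/cassandra/data

  # Bring the node back up; it rejoins with one empty data directory.
  sudo systemctl start cassandra

  # Stream the missing data back from the other replicas.
  nodetool repair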
Regards,
Christian

From: kurt greaves <k...@instaclustr.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, 15 August 2018 at 04:53
To: User <user@cassandra.apache.org>
Subject: Re: JBOD disk failure

If that disk had important data in the system tables, however, you might have some trouble and need to replace the entire instance anyway.

On 15 August 2018 at 12:20, Jeff Jirsa <jji...@gmail.com> wrote:

Depends on the version.

For versions without the fix from CASSANDRA-6696, the only safe option on a single disk failure is to stop and replace the whole instance. This is important because in older versions of Cassandra, you could have data in one sstable and a tombstone shadowing it on another disk, and that tombstone could be well past gc_grace_seconds. On disk failure in this scenario, if the disk holding the tombstone is lost, repair will propagate the (deleted, now resurrected) data to the other replicas, which probably isn't what you want to happen.

With 6696, you should be safe to replace the disk and run repair - 6696 keeps all data for a given token range on the same disk, so the resurrection problem is solved.

--
Jeff Jirsa

On Aug 14, 2018, at 6:10 AM, Christian Lorenz <christian.lor...@webtrekk.com> wrote:

Hi,

given a cluster with RF=3 and CL=LOCAL_ONE, and an application that is deleting data, what happens if the nodes are set up with JBOD and one disk fails? Do I get consistent results while the broken drive is replaced and a nodetool repair is running on the node with the replaced drive?

Kind regards,
Christian
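On the original question about CL=LOCAL_ONE: while the replaced drive is still being repaired, a LOCAL_ONE read can be served entirely by the rebuilt node and so miss data. A minimal cqlsh sketch of raising the read consistency for that window (keyspace and table names are made up):

  $ cqlsh
  cqlsh> CONSISTENCY LOCAL_QUORUM;
  cqlsh> SELECT * FROM myks.mytable WHERE id = 42;

With RF=3, a LOCAL_QUORUM read touches two replicas, so the one node still streaming data back cannot by itself produce an incomplete result.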