Why not use NetworkTopologyStrategy and specify each region as a ‘DC’?
Set up a snitch (PropertyFile or Gossip, or even the EC2Region one) to list out which nodes are in which DC. Then, when creating the keyspace, specify NetworkTopologyStrategy with RF=1 in each DC / rack, i.e.:

    CREATE KEYSPACE fred WITH replication = {'class': 'NetworkTopologyStrategy', 'DC2': '1', 'DC3': '1', 'DC1': '1'};

Regards
Mark Farnan

From: William Oberman [mailto:ober...@civicscience.com]
Sent: Tuesday, May 13, 2014 11:11 PM
To: user@cassandra.apache.org
Subject: NTS, vnodes and 0% chance of data loss

I found this:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201404.mbox/%3ccaeduwd1erq-1m-kfj6ubzsbeser8dwh+g-kgdpstnbgqsqc...@mail.gmail.com%3E

I read the three referenced cases. In addition, case 4123 references:
http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html

And even though I *think* I understand all of the issues now, I still want to double check...

Assumptions:
-A cluster using NTS with options [DC:3]
-Physical layout = In DC, 3 nodes/rack for a total of 9 nodes

No vnodes: I could do token selection using ideas from case 3810 such that each rack has one replica. At this point, my "0% chance of data loss" scenarios are:
1.) Failure of two nodes at random
2.) Failure of 2 racks (6 nodes!)

Vnodes: my "0% chance of data loss" scenarios are:
1.) Failure of two nodes at random
Which means a rack failure (3 nodes) has a non-zero chance of data failure (right?).

To get specific, I'm in AWS, so racks ~= "availability zones". In the years I've been in AWS, I've seen several occasions of "single zone downtimes", and one time of "single zone catastrophic loss". E.g. for AWS I feel like you *have* to plan for a single zone failure, and in terms of "safety first" you *should* plan for two zone failures.

To mitigate this data loss risk seems rough for vnodes, again if I'm understanding everything correctly:
-To ensure 0% data loss for one zone => I need RF=4
-To ensure 0% data loss for two zones => I need RF=7

I'd really like to use vnodes, but RF=7 is crazy.

To reiterate what I think is the core idea of this message:
1.) for vnodes 0% data loss => RF=(# of allowed failures at once)+1
2.) racks don't change the above equation at all

will
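
The RF arithmetic in the message can be checked by brute force. What follows is a minimal sketch in Python, not a model of Cassandra's real replica placement. It takes the message's own premise at face value: with vnodes, a replica set may land on any RF distinct nodes regardless of rack, while the hand-picked-token layout puts exactly one replica in each rack. The node names, rack names, and helper functions are hypothetical and exist only for this illustration.

```python
from itertools import combinations, product

# Hypothetical 9-node layout from the message: 3 racks (zones), 3 nodes each.
RACKS = {
    "rack1": ["n1", "n2", "n3"],
    "rack2": ["n4", "n5", "n6"],
    "rack3": ["n7", "n8", "n9"],
}
ALL_NODES = [node for nodes in RACKS.values() for node in nodes]


def survives(replica_sets, failed):
    """True if every replica set keeps at least one node outside `failed`."""
    failed = set(failed)
    return all(set(replicas) - failed for replicas in replica_sets)


def any_placement(rf):
    """Assumed worst case: any RF distinct nodes may hold a row's replicas."""
    return list(combinations(ALL_NODES, rf))


def one_per_rack():
    """Rack-aware layout: each replica set has exactly one node per rack."""
    return list(product(*RACKS.values()))


def rack_failures(racks_lost):
    """Every way of losing `racks_lost` whole racks, as lists of failed nodes."""
    return [
        sum((RACKS[name] for name in combo), [])
        for combo in combinations(RACKS, racks_lost)
    ]


def min_rf_ignoring_racks(racks_lost):
    """Smallest RF that survives losing `racks_lost` racks under any_placement."""
    for rf in range(1, len(ALL_NODES) + 1):
        if all(survives(any_placement(rf), f) for f in rack_failures(racks_lost)):
            return rf
    return None


if __name__ == "__main__":
    print("one zone lost, racks ignored: RF =", min_rf_ignoring_racks(1))
    print("two zones lost, racks ignored: RF =", min_rf_ignoring_racks(2))
    print(
        "RF=3, one replica per rack, survives two zones lost:",
        all(survives(one_per_rack(), f) for f in rack_failures(2)),
    )
```

Under these assumptions the search reproduces the RF=4 and RF=7 figures, and it shows the one-replica-per-rack layout surviving the loss of two racks at RF=3. Whether vnode placement really ignores racks is exactly the question the message is asking, so the sketch only confirms the arithmetic, not the premise.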