SimpleStrategy doesn’t take DC or rack into account at all. It simply places replicas on consecutive tokens around the ring, so you could end up with 3 copies in one DC and zero in the other. Here’s the relevant method:
/**
 * This class returns the nodes responsible for a given
 * key but does not respect rack awareness. Basically
 * returns the RF nodes that lie right next to each other
 * on the ring.
 */
public List<InetAddress> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
{
    int replicas = getReplicationFactor();
    ArrayList<Token> tokens = metadata.sortedTokens();
    List<InetAddress> endpoints = new ArrayList<InetAddress>(replicas);

    if (tokens.isEmpty())
        return endpoints;

    // Add the token at the index by default
    Iterator<Token> iter = TokenMetadata.ringIterator(tokens, token, false);
    while (endpoints.size() < replicas && iter.hasNext())
    {
        InetAddress ep = metadata.getEndpoint(iter.next());
        if (!endpoints.contains(ep))
            endpoints.add(ep);
    }
    return endpoints;
}

NTS keeps track of replicas per DC. The code is a bit longer, so I’ll just link to it:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java#L146
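The gist, though, is roughly this. Below is a simplified sketch of the idea, not the actual implementation: the ring walk is the same as above, but replicas are counted per DC instead of globally. The PerDcPlacementSketch name, the dcOf map, and the iterator-based signature are made up for illustration, and the real code also deals with racks and vnode token ownership.

import java.net.InetAddress;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Simplified sketch only: walk the ring in order, but track how many
// replicas each DC still needs. Hypothetical names, not Cassandra's API.
public class PerDcPlacementSketch
{
    // desired replicas per DC, e.g. {"DC1": 1, "DC2": 1}
    private final Map<String, Integer> rfPerDc;

    public PerDcPlacementSketch(Map<String, Integer> rfPerDc)
    {
        this.rfPerDc = rfPerDc;
    }

    public List<InetAddress> calculateEndpoints(Iterator<InetAddress> ringOrder,
                                                Map<InetAddress, String> dcOf)
    {
        Map<String, Integer> placed = new HashMap<>();
        List<InetAddress> endpoints = new ArrayList<>();

        while (ringOrder.hasNext() && !satisfied(placed))
        {
            InetAddress ep = ringOrder.next();
            String dc = dcOf.get(ep);
            // Take this node only if its DC still needs replicas.
            // SimpleStrategy has no such check, hence the possible imbalance.
            if (placed.getOrDefault(dc, 0) < rfPerDc.getOrDefault(dc, 0)
                && !endpoints.contains(ep))
            {
                endpoints.add(ep);
                placed.merge(dc, 1, Integer::sum);
            }
        }
        return endpoints;
    }

    private boolean satisfied(Map<String, Integer> placed)
    {
        for (Map.Entry<String, Integer> e : rfPerDc.entrySet())
            if (placed.getOrDefault(e.getKey(), 0) < e.getValue())
                return false;
        return true;
    }
}

With {'DC1': 1, 'DC2': 1} you’re guaranteed one replica in each DC, which is why Michael’s suggestion of repairing DC1 from DC2 works under NTS but not under SimpleStrategy with RF=1.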
Jon

> On Jul 21, 2017, at 11:31 AM, Eric Stevens <migh...@gmail.com> wrote:
>
> > If using the SimpleStrategy replication class, it appears that
> > replication_factor is the only option, which applies to the entire
> > cluster, so only one node in both datacenters would have the data.
>
> This runs counter to my understanding, or else I'm not reading your
> statement correctly. When a user chooses SimpleStrategy they're saying
> that the same replication applies to _each DC_ in your cluster, not that
> _all DCs_ contribute to the total replication.
>
> Put another way, my understanding is that if you have SimpleStrategy RF=1
> with two data centers, you have two copies of each piece of data, one in
> each DC.
>
> On Thu, Jul 20, 2017 at 2:56 PM Michael Shuler <mich...@pbandjelly.org> wrote:
> Datacenter replication is defined in the keyspace schema, so I believe that
> ...
> WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 1,
> 'DC2': 1}
> ...
> you ought to be able to repair DC1 from DC2, once you have the DC1 node
> healthy again.
>
> If using the SimpleStrategy replication class, it appears that
> replication_factor is the only option, which applies to the entire
> cluster, so only one node in both datacenters would have the data.
>
> https://cassandra.apache.org/doc/latest/cql/ddl.html#create-keyspace
>
> --
> Kind regards,
> Michael
>
> On 07/20/2017 03:23 PM, Roger Warner wrote:
> > Hi
> >
> > I'm a little dim on what multi-datacenter implies in the 1-replica
> > case. I know about replica recovery; how about "node recovery"?
> >
> > As I understand it, if there's a node failure or disk crash in a
> > single-node cluster with replication factor 1, I lose data. Easy.
> >
> > nodetool tells me each node in my 3-node x 2-datacenter cluster is
> > responsible for ~1/3 of the data. If in this cluster with RF=1 a node
> > fails in dc1, what happens? In 1 DC with data loss, can the node be
> > "restored" from a node in dc2? Automatically?
> >
> > I'm also asking, tangentially, how does the data map from nodes in dc1
> > to dc2?
> >
> > I hope I made that coherent.
> >
> > Roger