> I was/am under the impression that a node owns a particular token
> range, and does not save any data that falls outside of that range
> (with exception to any data that might be replicated to it). Based on
> what you are saying, each node owns a token range, but also maintains
> copies of data outside of the range. If this is correct, then I can
> understand how all of my previous questions seemed "wrong." Cassandra
> already does what I want, provided that I use the correct RF and CL
> values.

No. I am not entirely sure where the confusion comes from, so I will
just try to summarize things briefly from scratch.

Any piece of data you store in Cassandra is going to be in a
particular row, which has a row key.

That row will have a "replica set" in the Cassandra cluster. For RF=3,
that replica set contains three nodes. The replica set is the set of
nodes that are responsible for keeping data for a row.

In other words, with RF=3 the replica set for each possible row key
contains 3 nodes, so there will be 3 copies of the data in total.
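
To make the replica set concrete, here is a rough sketch of how a row
key could map to three nodes. Everything in it is an assumption made
for illustration: the four node names, the 0-99 token space, the toy
md5-based partitioner, and the simple "primary plus the next RF-1
nodes clockwise" placement. It is not Cassandra's actual code.

```python
import bisect
import hashlib

# Hypothetical 4-node ring: (token, node name), sorted by token.
# Each node is treated as owning the range that ends at its token.
RING = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]


def token_for(row_key):
    # Toy partitioner for illustration only (not Cassandra's real one).
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def replica_set(token, rf=3):
    # The first node whose token is >= the row's token owns the range;
    # the next rf-1 nodes clockwise hold the remaining copies.
    tokens = [t for t, _ in RING]
    start = bisect.bisect_left(tokens, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]


if __name__ == "__main__":
    key = "some-row-key"
    print(key, "->", replica_set(token_for(key)))
```

Whatever the row key is, the result always has exactly RF entries,
and nothing about the request itself changes that list.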

All the consistency levels always refer to nodes *in the replica set*.
For example, CL.ALL requires that all nodes *in the replica set*
respond. CL.QUORUM requires that a majority of all nodes *in the
replica set* respond.
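
As a small worked example of that arithmetic, the number of replica
responses needed can be written down directly. The function name and
the string constants below are made up for this sketch; only the
counts themselves (one, a majority, all) come from the definitions
above.

```python
def required_responses(consistency_level, rf):
    # Number of nodes *in the replica set* that must respond.
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return rf // 2 + 1  # strict majority of the replica set
    if consistency_level == "ALL":
        return rf
    raise ValueError("unknown consistency level: %r" % (consistency_level,))


if __name__ == "__main__":
    for cl in ("ONE", "QUORUM", "ALL"):
        print(cl, required_responses(cl, rf=3))
```

With RF=3 this gives 1 for ONE, 2 for QUORUM and 3 for ALL.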

From the perspective of a given node in the cluster, assuming for the
example RF=3, it will contain data for its own token range as well as
data for two other token ranges.
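
The same thing seen from the node's side, as a sketch: assuming a
hypothetical four-node ring and the simple placement where replicas
are the next nodes clockwise from the owner, each node ends up
storing its own range plus the ranges owned by the RF-1 nodes just
before it on the ring. The node names and the helper are illustrative
only.

```python
# Hypothetical ring order; each name also stands for that node's own range.
NODES = ["node-a", "node-b", "node-c", "node-d"]


def ranges_stored_by(node, rf=3):
    # Own range first, then the ranges of the rf-1 preceding nodes,
    # since those owners replicate clockwise onto this node.
    i = NODES.index(node)
    return [NODES[(i - k) % len(NODES)] for k in range(rf)]


if __name__ == "__main__":
    for n in NODES:
        print(n, "stores ranges of", ranges_stored_by(n))
```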

To reiterate another point: The choice of consistency level *never*
affects *which* nodes are responsible for a given row key, nor does it
affect which nodes will eventually receive writes. It *only* affects
how many nodes must respond before the operation (read or write) is
considered successful.
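
A toy model of that point, under stated assumptions: the function
below, its name, and the idea of a fixed set of "live" nodes are all
invented for illustration and skip everything a real cluster does
(timeouts, hinted handoff, repair). It only shows that the write is
addressed to the full replica set no matter what, and that the
consistency level decides nothing except whether enough
acknowledgements came back.

```python
def write(replicas, live_nodes, required):
    # The write is addressed to every replica, whatever the consistency level.
    sent_to = list(replicas)
    # Only replicas that are currently up can acknowledge.
    acks = [node for node in sent_to if node in live_nodes]
    # The consistency level only sets how many acks count as success.
    return sent_to, len(acks) >= required


if __name__ == "__main__":
    replicas = ["node-b", "node-c", "node-d"]  # replica set for some row, RF=3
    live = {"node-b", "node-c"}                # node-d is down
    for name, required in (("ONE", 1), ("QUORUM", 2), ("ALL", 3)):
        print(name, write(replicas, live, required))
```

In this example ONE and QUORUM succeed and ALL fails, while the list
of nodes the write was sent to is identical in all three cases.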

Does that make it clearer?

-- 
/ Peter Schuller (@scode on twitter)
