Small correction: the token range for each node is (Previous_token, My_Token], where "(" means exclusive and "]" means inclusive. So N1 is responsible for the range from X+1 to A in the following case.
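To make the (Previous_token, My_Token] rule concrete, here is a small illustrative sketch (my own toy code, not anything from Cassandra itself) using the node/token letters from Roland's example below. The primary for a key token is the first node whose token is greater than or equal to it, wrapping around to the lowest token at the end of the ring:

import java.util.Map;
import java.util.TreeMap;

public class RingLookup {
    public static void main(String[] args) {
        // node tokens -> node names; letters stand in for the numeric tokens
        TreeMap<String, String> ring = new TreeMap<>(
                Map.of("A", "N1", "D", "N2", "M", "N3", "X", "N4"));

        // Owner of a key token = first node token >= the key token
        // ((Previous_token, My_Token] semantics); wrap around if none exists.
        for (String keyToken : new String[] {"B", "E", "Y"}) {
            Map.Entry<String, String> owner = ring.ceilingEntry(keyToken);
            String node = owner != null ? owner.getValue() : ring.firstEntry().getValue();
            System.out.println(keyToken + " -> " + node);
        }
        // Prints: B -> N2, E -> N3, Y -> N1 (Y is past X, so it wraps to N1)
    }
}

Note that with this convention B lands on N2 rather than N1, which is exactly the difference the correction above is about.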
maki

2012/1/11 Roland Gude <roland.g...@yoochoose.com>:
>
> Each node in the cluster is assigned a token (this can be done automatically, but usually should not be).
> The token of a node is the start token of the partition it is responsible for (and the token of the next node is the end token of the current token's partition).
>
> Assume you have the following nodes/tokens (which are usually numbers, but for the example I will use letters):
>
> N1/A
> N2/D
> N3/M
> N4/X
>
> This means that N1 is responsible (primary) for [A-D)
> N2 for [D-M)
> N3 for [M-X)
> and N4 for [X-A)
>
> If you have a replication factor of 1, data will go on the nodes like this:
>
> B -> N1
> E -> N2
> X -> N4
>
> And so on.
> If you have a higher replication factor, the placement strategy decides which node will take replicas of which partition (becoming the secondary node for that partition).
> Simple strategy will just put the replica on the next node in the ring.
> So, same example as above but with an RF of 2 and simple strategy:
>
> B -> N1 and N2
> E -> N2 and N3
> X -> N4 and N1
>
> Other strategies can factor in things like "put data in another datacenter" or "put data in another rack" or such things.
>
> Even though the terms primary and secondary imply some measure of quality or consistency, this is not the case. If a node is responsible for a piece of data, it will store it.
>
> Placement of the replicas is usually only relevant for availability reasons (i.e. disaster recovery etc.).
> The actual location should mean nothing to most applications, as you can ask any node for the data you want and it will provide it to you (fetching it from the responsible nodes).
> This should be sufficient in almost all cases.
>
> So, in the above example again, you can ask N3 "what data is available" and it will tell you: B, E and X. Or you could ask it "give me X" and it will fetch it from N4 or N1 or both of them, depending on the consistency configuration, and return the data to you.
>
> So actually, if you use Cassandra, the actual storage location of the data should not matter to the application. It will be available anywhere in the cluster if it is stored on any reachable node.
>
> From: Andreas Rudolph [mailto:andreas.rudo...@spontech-spine.com]
> Sent: Tuesday, January 10, 2012 15:06
> To: user@cassandra.apache.org
> Subject: Re: AW: How to control location of data?
>
> Hi!
>
> Thank you for your last reply. I'm still wondering if I got you right...
>
> ...
> A partitioner decides into which partition a piece of data belongs.
>
> Does your statement imply that the partitioner does not take any decisions at all on the (physical) storage location? Or, put another way: what do you mean by "partition"?
>
> To quote http://wiki.apache.org/cassandra/ArchitectureInternals:
> "... AbstractReplicationStrategy controls what nodes get secondary, tertiary, etc. replicas of each key range. Primary replica is always determined by the token ring (...)"
>
> ...
> You can select different placement strategies and partitioners for different keyspaces, thereby choosing known data to be stored on known hosts.
> This is however discouraged for various reasons, e.g. you need a lot of knowledge about your data to keep the cluster balanced. What is your use case for this requirement? There is probably a more suitable solution.
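As an aside, Roland's RF=2 / simple strategy example can be expressed as a tiny ring walk. This is only an illustrative sketch (class and method names are mine, not Cassandra's actual SimpleStrategy code), and the primaries below follow the corrected (Previous_token, My_Token] ranges, so they are shifted by one node compared with the letters in Roland's mail:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SimpleStrategySketch {
    public static void main(String[] args) {
        // node tokens -> node names, in ring order
        TreeMap<String, String> ring = new TreeMap<>(
                Map.of("A", "N1", "D", "N2", "M", "N3", "X", "N4"));
        int rf = 2;
        for (String keyToken : new String[] {"B", "E", "X"}) {
            System.out.println(keyToken + " -> " + replicas(ring, keyToken, rf));
        }
        // Prints: B -> [N2, N3], E -> [N3, N4], X -> [N4, N1]
    }

    // Start at the primary (first node token >= key token, wrapping around)
    // and keep walking clockwise until RF distinct nodes have been collected.
    static List<String> replicas(TreeMap<String, String> ring, String keyToken, int rf) {
        List<String> result = new ArrayList<>();
        String token = ring.ceilingKey(keyToken);
        if (token == null) token = ring.firstKey();
        while (result.size() < rf && result.size() < ring.size()) {
            result.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) token = ring.firstKey();
        }
        return result;
    }
}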
> What we want is to partition the cluster with respect to key spaces.
> That is, we want to establish an association between nodes and key spaces so that a node of the cluster holds data from a key space if and only if that node is a *member* of that key space.
>
> To our knowledge Cassandra has no built-in way to specify such a membership relation. Therefore we thought of implementing our own replica placement strategy, until we started to assume that the partitioner had to be replaced, too, to accomplish the task.
>
> Do you have any ideas?
>
> From: Andreas Rudolph [mailto:andreas.rudo...@spontech-spine.com]
> Sent: Tuesday, January 10, 2012 09:53
> To: user@cassandra.apache.org
> Subject: How to control location of data?
>
> Hi!
>
> We're evaluating Cassandra for our storage needs. One of the key benefits we see is the online replication of the data, that is, an easy way to share data across nodes. But we need to precisely control on which node group specific parts of a key space (columns/column families) are stored. Now we're having trouble understanding the documentation. Could anyone help us find answers to our questions?
>
> · What does the term "replica" mean? If a key is stored on exactly three nodes in a cluster, is it correct to say that there are three replicas of that key, or are there just two replicas (copies) and one original?
>
> · What is the relation between the Cassandra concepts "Partitioner" and "Replica Placement Strategy"? According to documentation found on the DataStax web site and the architecture internals page in the Cassandra Wiki, the first storage location of a key (and its associated data) is determined by the "Partitioner", whereas additional storage locations are defined by the "Replica Placement Strategy". I'm wondering if I could completely redefine the way nodes are selected to store a key by just implementing my own subclass of AbstractReplicationStrategy and configuring that subclass into the key space.
>
> · How can I prevent the "Partitioner" from being consulted at all to determine which node stores a key first?
>
> · Is a key space always distributed across the whole cluster? Is it possible to configure Cassandra in such a way that more or less freely chosen parts of a key space (columns) are stored on arbitrarily chosen nodes?
>
> Any tips would be very appreciated :-)
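Regarding the membership idea: as far as I know the usual answer is to model the node groups as separate "datacenters" in the snitch and use a datacenter-aware placement strategy, but the rule you describe could also live in a custom replica placement strategy. Below is a purely hypothetical, self-contained sketch of that rule; the class, the MEMBERS map, the keyspace names and replicasFor() are all mine and not the Cassandra API. A real implementation would subclass AbstractReplicationStrategy (check the exact method signatures against your Cassandra version), and the partitioner would still pick the ring position; only the mapping from ring position to nodes changes.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class KeyspaceMembershipSketch {
    // all cluster nodes in ring order (same toy ring as above)
    static final List<String> RING = List.of("N1", "N2", "N3", "N4");

    // hypothetical membership table: which nodes may hold which keyspace
    static final Map<String, Set<String>> MEMBERS = Map.of(
            "keyspace_one", Set.of("N1", "N2"),
            "keyspace_two", Set.of("N3", "N4"));

    // Walk the ring clockwise from the primary position, but only accept
    // nodes that are members of the keyspace, until RF replicas are found.
    static List<String> replicasFor(String keyspace, int primaryIndex, int rf) {
        Set<String> members = MEMBERS.getOrDefault(keyspace, Set.of());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < RING.size() && replicas.size() < rf; i++) {
            String node = RING.get((primaryIndex + i) % RING.size());
            if (members.contains(node)) {
                replicas.add(node);
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        // A key whose token points at position 0 (N1) on the ring:
        System.out.println(replicasFor("keyspace_one", 0, 2)); // [N1, N2]
        System.out.println(replicasFor("keyspace_two", 0, 2)); // [N3, N4]
    }
}

The caveat Roland raises still applies: with such a strategy the operator, not the partitioner, is responsible for keeping those node groups balanced.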