num_tokens is the number of tokens per node, not per cluster.

On Thu, Jan 7, 2016 at 10:09 PM Alec Collier <alec.coll...@macquarie.com> wrote:
> Have a look at this:
>
> http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
>
> The vnodes mechanism is there to provide better scalability as new nodes
> are added/removed, by allowing a single node to own several small chunks of
> the token range.
>
> Aside from that, the process is exactly the same as in the single node
> case: the coordinator calculates the token based on the partition key and
> locates the responsible node in the same way. SSTables are located on the
> node's disk per Cassandra table, with no reference to vnodes at all. The
> term "virtual nodes" is a bit misleading in that sense.
>
> Actually, Cassandra does have a total number of vnodes per cluster. It's
> set with the num_tokens parameter in cassandra.yaml.
>
> Alec
>
> *From:* Sergi Vladykin [mailto:sergi.vlady...@gmail.com]
> *Sent:* Friday, 25 December 2015 8:31 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Data rebalancing algorithm
>
> Thanks a lot for your answers!
>
> Paulo, I'll take a look at the classes you've suggested.
>
> Jack, the link you've provided lacks a description of how virtual nodes
> are mapped to physical sstables/indexes on disk.
>
> To be more exact, I have the following more detailed questions:
>
> 1. How are vnodes mapped to sstables and indexes? Is one vnode a separate
> part of the sstable, or is all the data from all vnodes just mixed in the
> SSTable, or maybe something else?
>
> 2. As far as I can see, Cassandra does not have a predefined constant
> total number of vnodes for the whole cluster, right? Does it mean that on
> rebalancing some parts of the data already mapped to some vnodes will be
> remapped to new vnodes on the new node?
>
> 3. How long can the rebalancing take if we have, let's say, 1TB of data on
> a single node and we are adding one more node to the cluster?
>
> Sergi
>
> 2015-12-24 19:26 GMT+03:00 Jack Krupansky <jack.krupan...@gmail.com>:
>
> Read details here:
>
> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
>
> -- Jack Krupansky
>
> On Thu, Dec 24, 2015 at 11:09 AM, Paulo Motta <pauloricard...@gmail.com>
> wrote:
>
> The new node will own some parts (ranges) of the ring according to the
> ring tokens the node is responsible for. These tokens are defined by the
> yaml property initial_token (manual assignment) or num_tokens (random
> assignment).
>
> During the bootstrap process, raw data from the sstable sections
> containing the ranges the node is responsible for is transferred from the
> nodes that previously owned those ranges to the new node, so the source
> sstables are rebuilt on the joining node. After each sstable is
> transferred, the new node rebuilds primary and secondary indexes, bloom
> filters, etc., and at the end of the bootstrap process the new sstables
> are added to the live data set.
>
> See org.apache.cassandra.dht.BootStrapper.java and
> org.apache.cassandra.streaming.StreamReceiveTask of the trunk branch for
> more information.
>
> ps: I don't particularly recall any document with specific details, so if
> anyone knows of one, please feel free to share. If you want more
> theoretical information, see the ring membership sections of the cassandra
> and/or dynamo paper.
>
> 2015-12-24 13:14 GMT-02:00 Sergi Vladykin <sergi.vlady...@gmail.com>:
>
> Guys,
>
> I was not able to find a detailed description of the data rebalancing
> algorithm in the docs or on Google.
>
> I mean how Cassandra moves SSTables when a new node connects to the
> cluster, how primary and secondary indexes get transferred to this new
> node, etc.
>
> Can anyone provide relevant links please or just reply here?
>
> I can read the source code of course, but it would be nice if someone
> could answer right away :)
>
> Sergi
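
To make the per-node point concrete, here is a rough sketch of a token ring in Python. It is not Cassandra code: Ring, token_for, TOKEN_SPACE and the node names are all made up for illustration, md5 stands in for the real partitioner hash, and replication is ignored. It only shows that each node picks num_tokens random tokens, so the cluster-wide total grows with every node, and that the keys which change owner when a node joins all move to the new node.

```python
import bisect
import hashlib
import random

TOKEN_SPACE = 2**64  # hypothetical token space size, chosen for this sketch


def token_for(partition_key: str) -> int:
    """Stand-in hash for the sketch; not the hash a real partitioner uses."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: every node owns num_tokens randomly chosen tokens."""

    def __init__(self, num_tokens: int, seed: int = 0):
        self.num_tokens = num_tokens  # tokens per node, not per cluster
        self._rng = random.Random(seed)
        self._tokens = []   # all tokens in the ring, kept sorted
        self._owner = {}    # token -> name of the node that owns it

    def add_node(self, name: str) -> None:
        added = 0
        while added < self.num_tokens:
            token = self._rng.randrange(TOKEN_SPACE)
            if token in self._owner:
                continue  # extremely unlikely collision, pick another token
            self._owner[token] = name
            bisect.insort(self._tokens, token)
            added += 1

    def owner_of(self, partition_key: str) -> str:
        # A token T owns the range (previous token, T], so the owner is the
        # first ring token >= the key's token, wrapping around at the end.
        index = bisect.bisect_left(self._tokens, token_for(partition_key))
        if index == len(self._tokens):
            index = 0
        return self._owner[self._tokens[index]]

    def total_tokens(self) -> int:
        return len(self._tokens)


ring = Ring(num_tokens=8)
ring.add_node("node1")
ring.add_node("node2")
print(ring.total_tokens())  # 2 nodes * 8 tokens each = 16

keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.owner_of(k) for k in keys}

ring.add_node("node3")
after = {k: ring.owner_of(k) for k in keys}

# Keys whose owner changed; in this model they can only have moved to node3.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), {after[k] for k in moved})
```

The lookup never mentions vnodes or sstables, only tokens, which matches the point made earlier in the thread that the coordinator locates the responsible node the same way with or without vnodes.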