On Tue, Oct 26, 2010 at 14:56, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> On Tue, Oct 26, 2010 at 1:45 PM, Stu Hood <stu.h...@rackspace.com> wrote:
>> While the "adding virtual tokens/nodes to Cassandra" discussion is a good
>> one, there are a few factors that might delay (or remove?) the necessity of
>> adding that complexity:
>>
>> * In Cassandra 0.7, removing load from a node is fairly cheap: a bounded
>> number of reads are used to determine which portions of the large sorted
>> data files (sstables) to stream, followed by "sendfile" calls to deliver the
>> data to the destination
>> * For a replication factor RF, RF nodes can send data to a new node: this
>> means that to have all existing N nodes in your cluster participate in
>> adding K nodes, you only need to add N / RF = K nodes per expansion: this is
>> a much easier factor to achieve than a power of 2.
>>
>> While the added nodes will not be immediately balanced, there are some
>> possible improvements to our existing load-balancing facilities to better
>> handle unbalanced cases: see
>> https://issues.apache.org/jira/browse/CASSANDRA-1418
>>
>> Finally, virtual nodes are not a panacea: reviewing the papers on
>> https://issues.apache.org/jira/browse/CASSANDRA-192 suggests that they are
>> significantly more difficult to implement than our current solution.
>>
>> We haven't ruled virtual nodes out, but I think many of us are leaning
>> toward exploring improvements to our current architecture.
>>
>> Thanks,
>> Stu
>>
>> -----Original Message-----
>> From: "Greg Kim" <g...@netflix.com>
>> Sent: Tuesday, October 26, 2010 12:21pm
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Best practice for adding new nodes to ring
>>
>> Hi,
>>
>> I have a question regarding the best practices for adding new nodes to an
>> existing cluster. From reading the following wiki:
>> http://wiki.apache.org/cassandra/Operations -- I understand that when
>> creating a brand new cluster -- we can use the following to calculate the
>> initial token for each node to achieve balance in the ring:
>> def tokens(nodes):
>>     for i in range(1, nodes + 1):
>>         print (i * (2 ** 127 - 1) / nodes)
>>
>>
>> My question is on the best practice for adding new nodes to an existing
>> cluster. There is a recommendation in the wiki which is basically to
>> compute new tokens for every node and assign them manually using the
>> nodetool command. We're planning on running either 16GB or 32GB heaps on
>> each of our nodes, so token re-assignment for each node in the cluster
>> sounds like a very expensive operation, especially in situations where we're
>> adding new nodes to handle scaling issues w/ the existing cluster.
>>
>> I'm a bit of a noob to cassandra, so wanted to see how others are currently
>> coping w/ this. One option can be to grow the cluster in powers of 2 and
>> use bootstrapping w/ automatic token generation. Is this an option that
>> people are using? (but this gets exponentially expensive when you already
>> have a large # of nodes)
>>
>> Does anyone know why cassandra doesn't use virtual tokens (e.g. one node
>> token - creating 256 virtual node tokens in the ring)? This way adding new
>> nodes to an existing cluster will significantly mitigate the unbalance issue
>> in the ring.
>>
>>
>> Thanks
>> gkim
>>
>>
>
> One could implement "Virtual nodes" by running multiple instances of
> cassandra on a single machine, each binding to a different IP,
> possibly each using a different physical disk.
>
> I can imagine this would cause some overhead and waste. However, since
> current JVMs do not manage large heap sizes well, this would be the
> way I would imagine running cassandra on a "Big iron/mainframe"
> machine with 128GB RAM, 4 processors and 48 disks
You'd just want to make sure you have the IO capacity to handle it.
Personally, I think 8- or possibly 4-way systems would be up to the task
CPU-wise, but you'd have to think long and hard about how you would manage
disk IO.

Gary.
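
For reference, the wiki token formula quoted in Greg's message can be sketched in Python 3 as below. This is only an illustration under stated assumptions: the names RING_MAX, balanced_tokens, tokens_to_add and nodes_to_move are hypothetical helpers, not part of Cassandra or nodetool, and the sketch counts token reassignments only; it says nothing about how much data would be streamed.

```python
# Hypothetical helpers, not part of Cassandra: they only redo the arithmetic
# of the wiki formula quoted in the thread.
RING_MAX = 2 ** 127 - 1  # upper bound used by the quoted formula


def balanced_tokens(nodes):
    """Return evenly spaced initial tokens for a ring of `nodes` nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # Floor division keeps the result an exact integer; in Python 3 a plain
    # "/" returns a float, which cannot represent values this large exactly.
    return [i * RING_MAX // nodes for i in range(1, nodes + 1)]


def tokens_to_add(old_nodes, new_nodes):
    """Tokens of the new balanced layout that no existing node already holds."""
    existing = set(balanced_tokens(old_nodes))
    return [t for t in balanced_tokens(new_nodes) if t not in existing]


def nodes_to_move(old_nodes, new_nodes):
    """Count existing nodes whose token is absent from the new balanced layout."""
    target = set(balanced_tokens(new_nodes))
    return sum(1 for t in balanced_tokens(old_nodes) if t not in target)


if __name__ == "__main__":
    # Doubling a 4-node ring versus growing it to 6 nodes.
    print(nodes_to_move(4, 8), nodes_to_move(4, 6))
```

Under this formula, doubling the ring leaves every existing token in place (nodes_to_move(4, 8) is 0), while growing from 4 to 6 nodes would require moving 2 of the existing nodes to stay perfectly balanced. That seems to be the trade-off behind the "grow in powers of 2" option raised in the original question.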