On Mon, Aug 1, 2011 at 8:24 AM, Rafael Almeida <almeida...@yahoo.com> wrote:
> On Saturday, July 30, 2011, Rafael Almeida <almeida...@yahoo.com> wrote:
> > Hello,
> >
> > I have computers that are better than others in my cluster. In
> > particular, there's one which is much better and I'd like to give it
> > more load than the others. Is it possible? I'm using
> > RandomPartitioner, should I use another? Should I select tokens in
> > some particular way? How is load distribution implemented in
> > RandomPartitioner with respect to tokens?
>
> I'm answering myself this time. I think I've got things figured out, at
> least for RandomPartitioner. The token space goes from 0 to 2^127, so
> there are 2^127 possible tokens. The load a node will receive is
> proportional to the number of tokens assigned to it. If you assign
> 2^127 / 2 tokens to a node, it will be responsible for half the load in
> the system. If you assign 2^127 / 3 tokens to a node, it will be
> responsible for 1/3 of the load, and so on.
>
> But you assign only one token in cassandra's configuration file! True,
> but that's the first token for that node, in a range of tokens it will
> accept. The number of tokens actually assigned to it is the range from
> the value you wrote in initial_token in cassandra.yaml up to the next
> token.
>
> I find it hard to explain that without an example. So, let's say the
> token space is actually from 0 to 100 and we have 4 nodes (let's do
> this in order to make things more manageable). In our example, we have
> the following initial_tokens:
>
> node A = 0
> node B = 20
> node C = 70
> node D = 90
>
> Node A would have 20 - 0 = 20 tokens assigned to it (20/100 = 20% of
> the load). Node B would have 70 - 20 = 50 tokens assigned to it (50% of
> the load). Node C would have 90 - 70 = 20 tokens assigned to it (20% of
> the load) and, finally, node D would have the remaining 100 - 90 = 10
> tokens (10% of the load). See how that works?
>
> Now suppose you mess up your configuration.
> Let's say you set up initial_token like this:
>
> node A = 10
> node B = 20
> node C = 70
> node D = 90
>
> That way you'd have 10 unhandled tokens. I think cassandra detects it
> and sets things up in a way that no token is missing. But I'm not sure
> what it does exactly. I've tested it with two nodes and, when I make
> such an invalid configuration, I get each node handling 50% of the
> load.

There would be no missing token: node A will take care of token ranges
(90, 100] and [0, 10].

> I hope I've been clear. Please correct me if I misunderstood something.
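
For what it's worth, here is a minimal Python sketch of the arithmetic, not
anything taken from cassandra's code. It assumes each node owns the range
that ends at its own token, (previous_token, own_token], wrapping around
the ring, which is what the (90, 100] and [0, 10] answer above implies. The
function names ownership and tokens_for_weights are hypothetical helpers
made up for this illustration, and the 0 to 100 ring is the toy token space
from the example, not the real one.

```python
def ownership(tokens, ring_size=100):
    """Return {node: fraction of the ring owned}.

    Assumption: a node owns (previous_token, own_token], wrapping
    around the end of the ring. Tokens must be distinct.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    if len(ordered) == 1:
        # A single node owns the whole ring.
        return {ordered[0][0]: 1.0}
    shares = {}
    for i, (node, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # i == 0 wraps to the last node
        shares[node] = ((token - prev_token) % ring_size) / ring_size
    return shares


def tokens_for_weights(weights, ring_size=100):
    """Pick one token per node so ring shares follow the given weights.

    Hypothetical helper: walks the ring and places each node's token
    at the end of its slice; the last token wraps around to 0.
    """
    total = sum(weights.values())
    tokens = {}
    cumulative = 0
    for node, weight in weights.items():
        cumulative += weight
        tokens[node] = (cumulative * ring_size // total) % ring_size
    return tokens


if __name__ == "__main__":
    # The "messed up" configuration from the example above.
    for node, share in sorted(ownership({"A": 10, "B": 20, "C": 70, "D": 90}).items()):
        print(node, f"{share:.0%}")

    # Give node A twice the share of B and C on the real token space.
    print(tokens_for_weights({"A": 2, "B": 1, "C": 1}, ring_size=2**127))
```

Under that assumption the percentages are the same as in the example, but
each range belongs to the node whose token closes it, so with tokens 10,
20, 70 and 90 node A ends up with 20% of the ring and nothing is left
unhandled. This only models ring share, which matches load only as long as
keys are spread evenly over the token space.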