Re: how RandomPartitioner calculate tokens

Sylvain Lebresne Wed, 30 Jan 2013 01:48:29 -0800

I'll admit that this part of the DataStax documentation is a bit confusing
(and I'll reach out to the doc writers to make sure this is improved).

The partitioner (be it RandomPartitioner, Murmur3Partitioner or
OrderPreservingPartitioner) is pretty much only a hash function that defines
how to compute the token (its hash) of a key. In particular, the partitioner
has no notion whatsoever of data centers and, more generally, does not depend
in any way on how many nodes you have. So the token of a given key does not
change when nodes come and go.
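
To make that concrete, here is a rough Python sketch of a key-to-token
function in the spirit of RandomPartitioner. It is not Cassandra's code: the
function name is made up, and it assumes the token is the absolute value of
the key's MD5 digest read as a signed big-endian integer. The point is only
that nothing but the key goes in.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Hypothetical helper: token of a key, computed from the key bytes only."""
    digest = hashlib.md5(key).digest()
    # Assumption: the 16-byte digest is interpreted as a signed big-endian
    # integer and its absolute value is taken, so the result is in [0, 2**127].
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# No node count or data center appears anywhere in the computation.
token = random_partitioner_token(b"some-row-key")
```

Calling it twice with the same key gives the same token, whatever the
cluster looks like.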

However, to actually distribute data, each node is assigned a token (or
multiple ones with "vnodes"). Getting an even distribution of data depends on
the exact tokens picked for your nodes.
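
As an illustration of why the node tokens matter, here is a simplified
Python sketch of a ring lookup. It assumes that a node with token t is
responsible for the keys whose token falls in (previous node token, t],
wrapping around at the end of the ring, and it ignores replication entirely.
The node names and token values are invented for the example.

```python
from bisect import bisect_left


def owner_of(key_token: int, node_tokens: dict) -> str:
    """Hypothetical helper: node_tokens maps a node's token to its name."""
    ring = sorted(node_tokens)
    # First node token that is >= the key token; past the last node we wrap
    # around to the first node of the ring.
    idx = bisect_left(ring, key_token)
    return node_tokens[ring[idx % len(ring)]]


# Made-up ring with three nodes on a tiny token space.
nodes = {0: "node-a", 100: "node-b", 200: "node-c"}
first = owner_of(50, nodes)     # falls in (0, 100]
wrapped = owner_of(250, nodes)  # beyond the last token, wraps around
```

With evenly spaced node tokens, each node ends up owning a range of the
same size, which is where the even data distribution comes from.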

Now, the sentences of the doc you cite actually refer to how to calculate the
tokens you assign to nodes. In particular, what they describe is pretty much
what the small token-generator tool that comes with Cassandra
(http://goo.gl/rwea9) does, but it is not something Cassandra itself actually
does.
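
For what it's worth, the calculation the doc describes can be sketched in a
few lines of Python. This is not the token-generator tool itself, just the
"divide the hash range by the number of nodes" idea, done once per data
center. The function name is hypothetical, and the small per-data-center
offset is an assumption on my part: one simple way to keep two nodes from
ending up with the same token.

```python
RANDOM_PARTITIONER_RANGE = 2**127  # size of the RandomPartitioner token range


def evenly_spaced_tokens(node_count: int, offset: int = 0) -> list:
    """Hypothetical helper: split the range evenly among node_count nodes."""
    return [
        (i * RANDOM_PARTITIONER_RANGE // node_count + offset)
        % RANDOM_PARTITIONER_RANGE
        for i in range(node_count)
    ]


# Each data center is computed on its own, as if it were the only ring.
dc1_tokens = evenly_spaced_tokens(4)            # four nodes in the first DC
dc2_tokens = evenly_spaced_tokens(2, offset=1)  # two nodes in the second DC
```

You would then assign these values to the nodes yourself; nothing in the
partitioner does this for you.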

Also, the procedure to compute these tokens is pretty much the same for
RandomPartitioner and Murmur3Partitioner, except that the token range is not
exactly the same for the two partitioners. And as a side note, if you use
vnodes, you don't really have to bother with manually assigning tokens to
nodes.
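
To show that only the range changes, here is a small Python sketch
parameterized by partitioner. It assumes a token range starting at 0 and
spanning 2**127 for RandomPartitioner, and starting at -2**63 and spanning
2**64 for Murmur3Partitioner. The table and function names are made up for
the example.

```python
# Assumed (start, size) of each partitioner's token range.
PARTITIONER_RANGES = {
    "RandomPartitioner": (0, 2**127),
    "Murmur3Partitioner": (-2**63, 2**64),
}


def initial_tokens(partitioner: str, node_count: int) -> list:
    """Hypothetical helper: evenly spaced node tokens for one partitioner."""
    start, size = PARTITIONER_RANGES[partitioner]
    # Same division in both cases; only the start and size of the range differ.
    return [start + i * size // node_count for i in range(node_count)]


random_tokens = initial_tokens("RandomPartitioner", 4)
murmur3_tokens = initial_tokens("Murmur3Partitioner", 4)
```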

--
Sylvain

On Wed, Jan 30, 2013 at 9:22 AM, Manu Zhang <owenzhang1...@gmail.com> wrote:

>  Hi,
>
> As per the Datastax Cassandra Documentation 1.2,
>
> "for single data center deployments, tokens are calculated by dividing the
> hash range by the number of nodes in the cluster", *does it mean we have
> to recalculate the tokens of keys when nodes come and go?*
>
> "for multiple data center deployments, tokens are calculated per data
> center so that the hash range is evenly divide for the nodes in each data
> center." *This is understandable, but when I go to the getToken method of
> RandomPartitioner, I can't find any datacenter-aware token calculation
> code.*
>
>
> By the way, the documentation doesn't mention how Murmur3Partitioner
> calculates tokens for multiple data centers. Assuming it doesn't calculate
> tokens per data center, what difference between Murmur3Partitioner and
> RandomPartitioner makes that unnecessary?
>
> Thanks.
>
> *Manu Zhang*
>
