On Wed 30 Jan 2013 05:47:59 PM CST, Sylvain Lebresne wrote:
I'll admit that this part of the DataStax documentation is a bit
confusing (and I'll reach out to the doc writers to make sure this is
improved).

The partitioner (be it RandomPartitioner, Murmur3Partitioner or
OrderPreservingPartitioner) is pretty much only a hash function that
defines how to compute the token (its hash) of a key. In particular, the
partitioner has no notion whatsoever of data centers and, more
generally, does not depend in any way on how many nodes you have.
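
To make that concrete, here is a minimal sketch of a RandomPartitioner-style
token computation. It assumes the token is the MD5 digest of the key read
as a signed big-endian integer and then made non-negative; the function
name is made up for illustration and this is not the Cassandra code itself.
Note that neither the node count nor the data center appears anywhere.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Hypothetical helper: token of a key, RandomPartitioner style.

    Assumption: MD5 digest interpreted as a signed big-endian integer,
    then made non-negative, so the result lies in [0, 2**127].
    """
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# The token depends only on the key bytes, so it is stable no matter
# how many nodes or data centers the cluster has.
token = random_partitioner_token(b"user:42")
print(token)
```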

However, to actually distribute data, each node is assigned a token (or
multiple ones with "vnodes"). Getting an even distribution of data
depends on the exact tokens picked for your nodes.
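
A rough sketch of how node tokens and key tokens relate, assuming a node
owns the range from the previous node's token (exclusive) up to its own
token (inclusive), with wrap-around. The ring values and the owner()
helper are invented for illustration, and replication is ignored. The
point is that adding a node changes which node owns a key, not the key's
token.

```python
from bisect import bisect_left


def owner(ring, key_token):
    """Return the node owning key_token.

    ring is a list of (node_token, node_name) pairs sorted by token.
    Assumption: a node owns (previous token, its own token], and keys
    past the last token wrap around to the first node.
    """
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, key_token)  # first node token >= key token
    return ring[i % len(ring)][1]


ring = [(0, "A"), (100, "B"), (200, "C")]
key_token = 150  # computed once by the partitioner, never recalculated

before = owner(ring, key_token)

# A new node joins with token 175: only ownership moves.
ring_after = sorted(ring + [(175, "D")])
after = owner(ring_after, key_token)

print(before, after)
```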

Now, the sentences of the doc you cite actually refer to how to
calculate the tokens you assign to nodes. In particular, what they
describe is pretty much what the small token-generator tool that comes
with Cassandra (http://goo.gl/rwea9) does, but it is not something
Cassandra itself actually does.
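
As a sketch of the kind of calculation being described (not the actual
token-generator script), the following divides the RandomPartitioner
range, assumed here to be 2**127, evenly among the nodes of each data
center. The function names and the small per-data-center offset used to
avoid identical tokens are assumptions made for illustration.

```python
RANDOM_PARTITIONER_RANGE = 2 ** 127  # assumed ring size for RandomPartitioner


def even_tokens(node_count, offset=0, ring_size=RANDOM_PARTITIONER_RANGE):
    """Evenly spaced tokens for node_count nodes, shifted by offset."""
    return [
        (i * ring_size // node_count + offset) % ring_size
        for i in range(node_count)
    ]


def tokens_per_datacenter(nodes_per_dc):
    """Compute tokens per data center.

    Each data center divides the full range on its own; the index of the
    data center is used as an offset so two nodes never share a token
    (an illustrative choice, not necessarily what the real tool does).
    """
    return {
        dc: even_tokens(count, offset=index)
        for index, (dc, count) in enumerate(nodes_per_dc.items())
    }


single_dc = even_tokens(4)
multi_dc = tokens_per_datacenter({"DC1": 2, "DC2": 2})
print(single_dc)
print(multi_dc)
```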

Also, that procedure to compute tokens is pretty much the same for
RandomPartitioner and Murmur3Partitioner, except that the token range of
the two partitioners is not exactly the same. And as a side note, if you
use vnodes, you don't really have to bother with manually assigning
tokens to nodes.
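
To illustrate the range difference, here is the same even split adapted
to Murmur3Partitioner, assuming its tokens span -2**63 to 2**63 - 1
(versus 0 to 2**127 - 1 for RandomPartitioner). The function name is
hypothetical.

```python
def murmur3_even_tokens(node_count):
    """Evenly spaced tokens over the assumed Murmur3 range [-2**63, 2**63 - 1]."""
    return [-(2 ** 63) + i * 2 ** 64 // node_count for i in range(node_count)]


murmur3_tokens = murmur3_even_tokens(4)
print(murmur3_tokens)
```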

--
Sylvain


On Wed, Jan 30, 2013 at 9:22 AM, Manu Zhang <owenzhang1...@gmail.com> wrote:

    Hi,

    As per the DataStax Cassandra Documentation 1.2,

    "for single data center deployments, tokens are calculated by
    dividing the hash range by the number of nodes in the cluster".
    Does it mean we have to recalculate the tokens of keys when nodes
    come and go?

    "for multiple data center deployments, tokens are calculated per
    data center so that the hash range is evenly divided for the nodes
    in each data center." This is understandable, but when I go to
    the getToken method of RandomPartitioner, I can't find any
    datacenter-aware token calculation code.

    By the way, the documentation doesn't mention how
    Murmur3Partitioner calculates tokens for multiple data centers.
    Assuming it doesn't calculate tokens per data center, what
    difference between Murmur3Partitioner and RandomPartitioner
    makes that unnecessary?

    Thanks.

    Manu Zhang



Thanks Sylvain, it's all clear now.
