Re: Docs: Token Selection

AJ Wed, 15 Jun 2011 19:26:10 -0700

Ok. I understand the reasoning you laid out. But I think it should be documented more thoroughly. I was trying to get an idea of how flexible Cass lets you be with the various combinations of strategies, snitches, token ranges, etc.

It would be instructive to see a graphical representation of a cluster ring with multiple data centers. Google turned up nothing. I imagine it's a multilayer ring: one layer per data center, with the nodes of one layer slightly offset from the ones in the other (based on the example in the wiki). I would also like to know which node is next in the ring, so as to understand replica placement in, for example, the OldNetworkTopologyStrategy, whose doc states,

"...It places one replica in a different data center from the first (if there is any such data center), the third replica in a different rack in the first datacenter, and any remaining replicas on the first unused nodes on the ring."

I can only assume for now that "the ring" referred to is the "local" ring of the first data center.



On 6/15/2011 5:51 PM, Vijay wrote:

No, it won't... it will assume you are doing the right thing...

Regards,
</VJ>

On Wed, Jun 15, 2011 at 2:34 PM, AJ <a...@dude.podzone.net> wrote:


    Vijay, thank you for your thoughtful reply.  Will Cass complain if
    I don't set up my tokens like in the examples?


    On 6/15/2011 2:41 PM, Vijay wrote:

    All you heard is right...
    You are not overriding Cassandra's token assignment by saying
    here is your token...

    Logic is:
    Calculate a token for the given key...
    Find the node in each region independently (if you use NTS and
    have set the strategy options to replicate to the other
    region)...
    Search for the ranges in each region independently.
    Replicate the data to that node.
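
A rough sketch of the lookup described above, using the toy 16-token ring from this thread. This is an illustration only, not Cassandra's actual code: the node names and the `replicas` helper are made up, and it picks a single replica per data center while ignoring racks and per-DC replica counts.

```python
from bisect import bisect_left

# Toy 16-token ring from this thread: (token, node name) pairs per data center.
# Node names are hypothetical.
RING = {
    "DC1": [(0, "dc1-node1"), (8, "dc1-node2")],
    "DC2": [(4, "dc2-node1"), (12, "dc2-node2")],
}

def replica_in_dc(key_token, dc_nodes):
    """Return the first node in this DC whose token is >= key_token, wrapping around."""
    nodes = sorted(dc_nodes)
    tokens = [t for t, _ in nodes]
    idx = bisect_left(tokens, key_token) % len(nodes)
    return nodes[idx][1]

def replicas(key_token, ring=RING):
    """Pick one replica per data center, searching each DC's tokens independently."""
    return {dc: replica_in_dc(key_token, nodes) for dc, nodes in ring.items()}
```

For a key whose token is 5, this picks the node at token 8 in DC1 and the node at token 12 in DC2; each data center is searched as if it were its own ring.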

    For multi-DC, Cassandra needs the nodes to be equally partitioned
    within each DC (if you care that the load is equally
    distributed)... and there shouldn't be any collision of
    tokens within the cluster...

    The documentation, and the example in it, try to explain the
    same thing.
    Hope this clarifies...

    More examples if it helps....

    DC1 Node 1 : token 0
    DC1 Node 2 : token 8..

    DC2 Node 1 : token 4..
    DC2 Node 2 : token 12..

    or

    DC1 Node 1 : token 0
    DC1 Node 2 : token 1..

    DC2 Node 1 : token 8..
    DC2 Node 2 : token  7..

    Regards,
    </VJ>



    On Wed, Jun 15, 2011 at 12:28 PM, AJ <a...@dude.podzone.net> wrote:

        On 6/15/2011 12:14 PM, Vijay wrote:

        Correction....

        "The problem in the above approach is you have 2 nodes
        between 12 to 4 in DC1 but from 4 to 12  you just have 1"

        should be

        "The problem in the above approach is you have 1 node
        between 0-4 (25%) and one node covering the rest, which
        is 4-16, 0-0 (75%)"

        Regards,
        </VJ>


        Ok, I think you are saying that the computed token range
        intervals are incorrect and that they would be:

        DC1
        *node 1 = 0      Range: (4, 16], (0, 0]

        node 2 = 4      Range: (0, 4]

        DC2
        *node 3 = 8      Range: (12, 16], (0, 8]

        node 4 = 12   Range: (8, 12]
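
To check the intervals listed above, here is a small sketch that treats each data center as its own ring of size 16 and computes the range and share each token owns. The function name `ranges_per_dc` and the ring size are assumptions for illustration; a range written `(4, 0]` wraps around, i.e. it is the same as `(4, 16], (0, 0]`.

```python
def ranges_per_dc(tokens, ring_size=16):
    """Map each token to (range start, range end, fraction owned) within one DC."""
    ordered = sorted(tokens)
    owned = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # index -1 wraps to the last token in the DC
        # A lone node owns the whole ring, so a width of 0 means ring_size.
        width = (tok - prev) % ring_size or ring_size
        owned[tok] = (prev, tok, width / ring_size)
    return owned

dc1 = ranges_per_dc([0, 4])    # {0: (4, 0, 0.75), 4: (0, 4, 0.25)}
dc2 = ranges_per_dc([8, 12])   # {8: (12, 8, 0.75), 12: (8, 12, 0.25)}
```

Under these assumptions each DC ends up with one node holding 75% and the other 25%, which is the imbalance described in the correction above.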

        If so, then yes, this is what I am seeking to confirm since I
        haven't found any documentation stating this directly and
        that reference that I gave only implies this; that is, that
        the token ranges are calculated per data center rather than
        per cluster.  I just need someone to confirm that 100%
        because it doesn't sound right to me based on everything else
        I've read.

        SO, the question is:  Does Cass calculate the consecutive
        node token ranges A.) per cluster, or B.) for the whole data
        center?

        From all I understand, the answer is B.  But, that
        documentation (reprinted below) implies A... or something
        that doesn't make sense to me because of the token placement
        in the example:

        "With NetworkTopologyStrategy, you should calculate the
        tokens for the nodes in each DC independently...

        DC1
        node 1 = 0
        node 2 = 85070591730234615865843651857942052864

        DC2
        node 3 = 1
        node 4 = 85070591730234615865843651857942052865"
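
The token values in that quoted example can be reproduced with a short sketch: space the tokens evenly within each DC, then shift each DC by a small offset so no two nodes in the cluster share a token. The helper `dc_tokens` and the use of the DC index as the offset are assumptions for illustration; the ring size of 2**127 is inferred from the example values (node 2's token is 2**126).

```python
RING_SIZE = 2 ** 127  # assumed token space; node 2's token above is half of this

def dc_tokens(node_count, dc_index):
    """Evenly spaced tokens for one DC, shifted by dc_index to avoid collisions."""
    return [i * RING_SIZE // node_count + dc_index for i in range(node_count)]

dc1 = dc_tokens(2, 0)  # [0, 85070591730234615865843651857942052864]
dc2 = dc_tokens(2, 1)  # [1, 85070591730234615865843651857942052865]

# No token may appear twice anywhere in the cluster.
assert not set(dc1) & set(dc2)
```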


        However, I do see why this would be helpful, but first I'm
        just asking if this token assignment is absolutely mandatory
        or if it's just a technique to achieve some end.
