Re: Generating evenly distributed tokens for vnodes

Kornel Pal Wed, 27 May 2020 11:53:09 -0700

As I understand, the previous discussion is about using allocate_tokens_for_keyspace for allocating tokens for most of the nodes. On the other hand, I am proposing to generate all the tokens for all the nodes using a Python script.

This seems to result in perfectly even token ownership distribution across all the nodes for all possible replication factors, thus being an improvement over using allocate_tokens_for_keyspace.


Elliott Sims wrote:

There's also a slightly older mailing list discussion on this subject that goes into detail on this sort of strategy: https://www.mail-archive.com/user@cassandra.apache.org/msg60006.html

I've been approximately following it, repeating steps 3-6 for the first host in each "rack" (replica, since I have 3 racks and RF=3), then 8-10 for the remaining hosts in the new datacenter. So far, so good (sample size of 1), but it's a pretty painstaking process.

This should get a lot simpler with Cassandra 4+'s "allocate_tokens_for_local_replication_factor" option, which will default to 3.

On Wed, May 27, 2020 at 4:34 AM Kornel Pal <kornel...@gmail.com> wrote:


    Hi,

    Generating ideal tokens for single-token datacenters is well understood
    and documented, but there is much less information available on
    generating tokens with even ownership distribution when using vnodes.
    The best description I could find on token generation for vnodes is
    https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html

    While allocate_tokens_for_keyspace results in much more even ownership
    distribution than random allocation, and does a great job at balancing
    ownership when adding new nodes, using it for creating a new datacenter
    results in less than ideal ownership distribution.

    After some experimentation, I found that it is possible to generate all
    the tokens for a new datacenter with an extended version of the Python
    script presented in the above blog post. Using these tokens seems to
    result in perfectly even ownership distribution with various
    token/node/rack configurations for all possible replication factors.

    Murmur3Partitioner:
      >>> datacenter_offset = 0
      >>> num_tokens = 4
      >>> num_racks = 3
      >>> num_nodes = 3
      >>> print "\n".join([
      ...     '[Rack #{}, Node #{}] initial_token: {}'.format(
      ...         r + 1, n + 1,
      ...         ','.join([str(((2**64 / (num_tokens * num_nodes * num_racks))
      ...                        * (t * num_nodes * num_racks + n * num_racks + r))
      ...                       - 2**63 + datacenter_offset)
      ...                   for t in range(num_tokens)]))
      ...     for r in range(num_racks) for n in range(num_nodes)])
    [Rack #1, Node #1] initial_token: -9223372036854775808,-4611686018427387908,-8,4611686018427387892
    [Rack #1, Node #2] initial_token: -7686143364045646508,-3074457345618258608,1537228672809129292,6148914691236517192
    [Rack #1, Node #3] initial_token: -6148914691236517208,-1537228672809129308,3074457345618258592,7686143364045646492
    [Rack #2, Node #1] initial_token: -8710962479251732708,-4099276460824344808,512409557603043092,5124095576030430992
    [Rack #2, Node #2] initial_token: -7173733806442603408,-2562047788015215508,2049638230412172392,6661324248839560292
    [Rack #2, Node #3] initial_token: -5636505133633474108,-1024819115206086208,3586866903221301692,8198552921648689592
    [Rack #3, Node #1] initial_token: -8198552921648689608,-3586866903221301708,1024819115206086192,5636505133633474092
    [Rack #3, Node #2] initial_token: -6661324248839560308,-2049638230412172408,2562047788015215492,7173733806442603392
    [Rack #3, Node #3] initial_token: -5124095576030431008,-512409557603043108,4099276460824344792,8710962479251732692
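The session above is Python 2: it uses the print statement and relies on "/" performing floor division on integers. As a minimal sketch for Python 3 readers, the same formula can be written as a small function. The name generate_tokens and its parameters (nodes_per_rack, ring_size, ring_start) are hypothetical and chosen only for illustration; the defaults assume Murmur3Partitioner's token range of -2**63 to 2**63 - 1.

```python
def generate_tokens(num_tokens, num_racks, nodes_per_rack,
                    ring_size=2**64, ring_start=-2**63, datacenter_offset=0):
    """Return {(rack, node): [tokens]} with 1-based rack and node numbers.

    Hypothetical helper, not part of Cassandra. The defaults assume
    Murmur3Partitioner; for RandomPartitioner pass ring_size=2**127 and
    ring_start=0.
    """
    total = num_tokens * nodes_per_rack * num_racks
    # "//" reproduces the floor division that "/" performs on ints in Python 2.
    step = ring_size // total
    tokens = {}
    for r in range(num_racks):
        for n in range(nodes_per_rack):
            tokens[(r + 1, n + 1)] = [
                # Position on the ring: vnode index first, then node, then rack.
                step * (t * nodes_per_rack * num_racks + n * num_racks + r)
                + ring_start + datacenter_offset
                for t in range(num_tokens)
            ]
    return tokens


# Same parameters as the session above: 4 tokens, 3 racks, 3 nodes per rack.
for (rack, node), toks in generate_tokens(4, 3, 3).items():
    print("[Rack #{}, Node #{}] initial_token: {}".format(
        rack, node, ",".join(str(t) for t in toks)))
```

Because the rack index varies fastest in the ring position, consecutive tokens on the ring rotate through the racks, which is what lets rack-aware replica placement stay balanced.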

    RandomPartitioner:
      >>> datacenter_offset = 0
      >>> num_tokens = 4
      >>> num_racks = 3
      >>> num_nodes = 3
      >>> print "\n".join([
      ...     '[Rack #{}, Node #{}] initial_token: {}'.format(
      ...         r + 1, n + 1,
      ...         ','.join([str(((2**127 / (num_tokens * num_nodes * num_racks))
      ...                        * (t * num_nodes * num_racks + n * num_racks + r))
      ...                       + datacenter_offset)
      ...                   for t in range(num_tokens)]))
      ...     for r in range(num_racks) for n in range(num_nodes)])
    [Rack #1, Node #1] initial_token: 0,42535295865117307932921825928971026427,85070591730234615865843651857942052854,127605887595351923798765477786913079281
    [Rack #1, Node #2] initial_token: 14178431955039102644307275309657008809,56713727820156410577229101238628035236,99249023685273718510150927167599061663,141784319550391026443072753096570088090
    [Rack #1, Node #3] initial_token: 28356863910078205288614550619314017618,70892159775195513221536376548285044045,113427455640312821154458202477256070472,155962751505430129087380028406227096899
    [Rack #2, Node #1] initial_token: 4726143985013034214769091769885669603,47261439850130342147690917698856696030,89796735715247650080612743627827722457,132332031580364958013534569556798748884
    [Rack #2, Node #2] initial_token: 18904575940052136859076367079542678412,61439871805169444791998193008513704839,103975167670286752724920018937484731266,146510463535404060657841844866455757693
    [Rack #2, Node #3] initial_token: 33083007895091239503383642389199687221,75618303760208547436305468318170713648,118153599625325855369227294247141740075,160688895490443163302149120176112766502
    [Rack #3, Node #1] initial_token: 9452287970026068429538183539771339206,51987583835143376362460009468742365633,94522879700260684295381835397713392060,137058175565377992228303661326684418487
    [Rack #3, Node #2] initial_token: 23630719925065171073845458849428348015,66166015790182479006767284778399374442,108701311655299786939689110707370400869,151236607520417094872610936636341427296
    [Rack #3, Node #3] initial_token: 37809151880104273718152734159085356824,80344447745221581651074560088056383251,122879743610338889583996386017027409678,165415039475456197516918211945998436105
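The even-ownership claim can be checked offline with a rough sketch like the one below. It rebuilds the Murmur3Partitioner ring with the same formula and sums, per node, the width of every token range the node would replicate. The helpers build_ring and ownership are hypothetical names, and replica selection is deliberately simplified to "the next RF distinct nodes clockwise"; it is not Cassandra's NetworkTopologyStrategy. In this particular layout consecutive ring tokens rotate through the racks, so the first replicas chosen this way also fall in different racks.

```python
from collections import Counter


def build_ring(num_tokens, num_racks, nodes_per_rack,
               ring_size=2**64, ring_start=-2**63):
    """Sorted list of (token, (rack, node)); same formula as the one-liner."""
    total = num_tokens * num_racks * nodes_per_rack
    step = ring_size // total
    ring = []
    for r in range(num_racks):
        for n in range(nodes_per_rack):
            for t in range(num_tokens):
                index = t * nodes_per_rack * num_racks + n * num_racks + r
                ring.append((step * index + ring_start, (r, n)))
    return sorted(ring)


def ownership(ring, rf, ring_size=2**64):
    """Total width of the token ranges replicated on each node.

    Assumption: replicas are the next `rf` distinct nodes clockwise,
    a simplification of real replica placement.
    """
    owned = Counter()
    size = len(ring)
    for i, (token, _) in enumerate(ring):
        # A token owns the range (previous token, token]; index -1 wraps around.
        width = (token - ring[i - 1][0]) % ring_size or ring_size
        replicas = []
        k = i
        while len(replicas) < rf and k < i + size:
            node = ring[k % size][1]
            if node not in replicas:
                replicas.append(node)
            k += 1
        for node in replicas:
            owned[node] += width
    return owned


ring = build_ring(4, 3, 3)
for rf in range(1, 10):
    owned = ownership(ring, rf)
    spread = max(owned.values()) - min(owned.values())
    print("RF={}: max-min ownership difference = {} tokens".format(rf, spread))
```

Any remaining difference comes from the floor division: the range that wraps around the end of the ring is slightly wider than the others, so ownership is even up to that rounding remainder rather than exactly equal.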

    Could you please comment on whether this is a good approach for
    allocating tokens when using vnodes?

    Thank you.

    Regards,
    Kornel


    ---------------------------------------------------------------------
    To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
    <mailto:user-unsubscr...@cassandra.apache.org>
    For additional commands, e-mail: user-h...@cassandra.apache.org
    <mailto:user-h...@cassandra.apache.org>

