Hello again :), I thought a little bit more about this question, and I was actually wondering if something like this would work:
Imagine a 3-node cluster, and create the nodes using the following settings.

For all 3 nodes: `num_tokens: 4`

Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2, 4611686018427387901`
Node 2: `initial_token: -7686143364045646507, -3074457345618258604, 1537228672809129299, 6148914691236517202`
Node 3: `initial_token: -6148914691236517206, -1537228672809129303, 3074457345618258600, 7686143364045646503`

If you know the initial size of your cluster, you can calculate the total number of tokens (number of nodes * vnodes) and use the formula / Python code below to get the tokens (there is also a small standalone sketch of the same calculation further down). Then use the first token for the first node, the second token for the second node, and so on, round-robin. In my case there is a total of 12 tokens (3 nodes, 4 tokens each):

```
>>> number_of_tokens = 12
>>> [str(((2**64 // number_of_tokens) * i) - 2**63) for i in range(number_of_tokens)]
['-9223372036854775808', '-7686143364045646507', '-6148914691236517206', '-4611686018427387905', '-3074457345618258604', '-1537228672809129303', '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901', '6148914691236517202', '7686143364045646503']
```

It actually works nicely, apparently. Here is a quick ccm test I ran with the configuration above:

```
$ ccm node1 nodetool status tlp_lab
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  82.47 KiB  4       66.7%             1ed8680b-7250-4088-988b-e4679514322f  rack1
UN  127.0.0.2  99.03 KiB  4       66.7%             ab3655b5-c380-496d-b250-51b53efb4c00  rack1
UN  127.0.0.3  82.36 KiB  4       66.7%             ad2b343e-5f6e-4b0d-b79f-a3dfc3ba3c79  rack1
```

Ownership is perfectly distributed, just as it would be without vnodes. Tested with C* 3.11.1 and CCM.

For my second test I followed the procedure we were talking about, after wiping out the data in my 3-node ccm cluster: RF=2 for tlp_lab, the first node with `initial_token` defined, and the other nodes using 'allocate_tokens_for_keyspace: tlp_lab':

```
$ ccm node1 nodetool status tlp_lab
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  86.71 KiB  4       96.2%             6e4c0ce0-2e2e-48ff-b7e0-3653e76366a3  rack1
UN  127.0.0.2  65.63 KiB  4       54.2%             592cda85-5807-4e7a-aa3b-0d9ae54cfaf3  rack1
UN  127.0.0.3  99.04 KiB  4       49.7%             f2c4eccc-31cc-458c-a599-5373c1169d3c  rack1
```

This is not as great. I guess a fourth node would help, but it would still not be perfect. I would still check what happens when you add a few more nodes afterward with 'allocate_tokens_for_keyspace' and without 'initial_token', just to avoid any surprise. I have not seen anyone using this approach yet, so please take it as an idea to dig into, not as a recommendation :).

I also noticed I did not answer the second part of the mail:

> My cluster size won't go beyond 150 nodes, should I still use the
> Allocation Algorithm instead of random with 256 tokens (performance-wise or
> load-balance-wise)?

I would say yes. There is talk of changing this default (256 vnodes), which is now probably always a bad idea since 'allocate_tokens_for_keyspace' was added.

> Is the Allocation Algorithm widely used and tested by the community, and can
> we migrate all clusters of any size to use this algorithm safely?

Here again, I would say yes. I am not sure that it is widely used yet, but I think so.
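If it helps, here is a minimal standalone sketch of the same calculation: it generates evenly spaced Murmur3 tokens and splits them round-robin across the nodes, reproducing the `initial_token` lines listed at the top of this mail. The node and vnode counts are just the example values from above; adapt them to your own cluster.

```
# Minimal sketch: evenly spaced Murmur3 tokens, split round-robin across nodes.
# The counts below are just the example values from this mail (3 nodes, 4 vnodes).
nodes = 3
vnodes = 4  # num_tokens per node

total = nodes * vnodes
# Evenly spaced tokens over the Murmur3 range [-2**63, 2**63 - 1]
tokens = [((2**64 // total) * i) - 2**63 for i in range(total)]

# Round-robin: token 0 -> node 1, token 1 -> node 2, token 2 -> node 3, token 3 -> node 1, ...
for n in range(nodes):
    node_tokens = tokens[n::nodes]
    print("Node %d initial_token: %s" % (n + 1, ", ".join(str(t) for t in node_tokens)))
```

Running it prints the three `initial_token` lines shown above, ready to paste into each node's cassandra.yaml.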
Also, you can always check the ownership with 'nodetool status <keyspace>' after adding the nodes, and before sending data or traffic to this data center, so there is probably no real risk as long as you check the ownership distribution after adding nodes. If you don't like the distribution, you can decommission the nodes, clean them, and try again; I usually call that 'rolling the dice' when I am still using the random algorithm :). (There is a small sketch at the very end of this mail, after the quoted thread, showing one way to summarize the spread.) I mean, once the token range ownership has been distributed to the nodes, it does not change anything during normal operation; we don't need this 'algorithm' after the bootstrap, I would say.

> Out of curiosity, I wonder how people (e.g. at Apple) configure and maintain
> token management of clusters with thousands of nodes?

I am not sure about Apple, but my understanding is that some of those companies don't use vnodes and have a 'ring management tool' to perform the necessary 'nodetool move' operations around the cluster relatively easily or automatically. Some others probably use a low number of vnodes (something between 4 and 32) together with 'allocate_tokens_for_keyspace'. Also, my understanding is that it's very rare to have clusters with thousands of nodes; you can then start having issues around gossip, if I remember correctly what I read/discussed. I would probably add a second cluster when the first one gets too big (hundreds of nodes), or split per service/workflow for example. In practice, the operational complexity is reduced by automating operations and/or having good tooling to operate efficiently.

On Mon, Oct 1, 2018 at 12:37, onmstester onmstester <onmstes...@zoho.com> wrote:

> Thanks Alex,
> You are right, that would be a mistake.
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
> ============ Forwarded message ============
> From : Oleksandr Shulgin <oleksandr.shul...@zalando.de>
> To : "User" <user@cassandra.apache.org>
> Date : Mon, 01 Oct 2018 13:53:37 +0330
> Subject : Re: Re: how to configure the Token Allocation Algorithm
> ============ Forwarded message ============
>
> On Mon, Oct 1, 2018 at 12:18 PM onmstester onmstester <onmstes...@zoho.com> wrote:
>
>> What if instead of running that python and having one node with non-vnode
>> config, I remove the first seed node and re-add it after the cluster was
>> fully up? So the token ranges of the first seed node would also be assigned
>> by the Allocation Alg.
>
> I think this is tricky because the random allocation of the very first
> tokens from the first seed affects the choice of tokens made by the
> algorithm on the rest of the nodes: it basically tries to divide the token
> ranges in more or less equal parts. If your very first 8 tokens resulted
> in really bad balance, you are not going to remove that imbalance by
> removing the node; it would still have a lasting effect on the rest of
> your cluster.
>
> --
> Alex
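As a small illustration of the 'check the ownership before sending traffic' step above, here is a minimal sketch that runs `nodetool status <keyspace>` and reports the min/max effective ownership, so you can decide whether to roll the dice again. The `ownership_spread` helper is purely illustrative (not an existing tool), it assumes the standard nodetool status layout shown earlier, and the command will need adjusting for your setup (e.g. `ccm node1 nodetool` for a ccm cluster).

```
# Minimal sketch: summarize the "Owns (effective)" column of
# `nodetool status <keyspace>` to judge how balanced the ring is.
# Assumes the standard nodetool status layout shown in the examples above.
import re
import subprocess

def ownership_spread(keyspace, nodetool_cmd=("nodetool",)):
    # For a ccm cluster, pass nodetool_cmd=("ccm", "node1", "nodetool")
    out = subprocess.check_output(list(nodetool_cmd) + ["status", keyspace]).decode()
    owns = [float(m.group(1)) for m in re.finditer(r"(\d+(?:\.\d+)?)%", out)]
    return min(owns), max(owns)

if __name__ == "__main__":
    low, high = ownership_spread("tlp_lab")
    print("Effective ownership: min %.1f%%, max %.1f%%" % (low, high))
    # The first test above would give 66.7% / 66.7%,
    # the second one 49.7% / 96.2%.
```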