Hi Anthony,

Thank you for your hints. The new DC is now well balanced, within 2%. I had read your article, but I thought it applied only to new clusters, not to new DCs as well. Since RF is set per DC, it makes sense.
You TLP guys are doing a great job for the Cassandra community.

Thank you,
Enrico

On Fri, 29 Nov 2019 at 05:09, Anthony Grasso <anthony.gra...@gmail.com> wrote:

> Hi Enrico,
>
> This is a classic chicken and egg problem with the
> allocate_tokens_for_keyspace setting.
>
> The allocate_tokens_for_keyspace setting uses the replication factor of a
> DC keyspace to calculate the token allocation when a node is added to the
> cluster for the first time.
>
> Nodes need to be added to the new DC before we can replicate the keyspace
> over to it. Herein lies the problem. We are unable to use
> allocate_tokens_for_keyspace unless the keyspace is replicated to the new
> DC. In addition, as soon as you change the keyspace replication to the new
> DC, new data will start to be written to it. To work around this issue you
> will need to do the following.
>
>    1. Decommission all the nodes in the *dcNew*, one at a time.
>    2. Once all the *dcNew* nodes are decommissioned, wipe the contents in
>    the *commitlog*, *data*, *saved_caches*, and *hints* directories of
>    these nodes.
>    3. Make the first node to add into the *dcNew* a seed node. Set the
>    seed list of the first node with its IP address and the IP addresses of the
>    other seed nodes in the cluster.
>    4. Set the *initial_token* setting for the first node. You can
>    calculate the values using the algorithm in my blog post:
>    https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>    For convenience I have calculated them:
>    *-9223372036854775808,-4611686018427387904,0,4611686018427387904*.
>    Note: remove the *allocate_tokens_for_keyspace* setting from the
>    *cassandra.yaml* file for this (seed) node.
>    5. Check to make sure that no other node in the cluster is assigned
>    any of the four tokens specified above. If there is another node in the
>    cluster that is assigned one of the above tokens, increment the conflicting
>    token by one, repeatedly, until no other node in the cluster is assigned
>    that token value. The idea is to make sure that these four tokens are
>    unique to the node.
>    6. Add the seed node to the cluster. Make sure it is listed in *dcNew* by
>    checking nodetool status.
>    7. Create a dummy keyspace in *dcNew* that has a replication factor of
>    2.
>    8. Set the *allocate_tokens_for_keyspace* value to be the name of the
>    dummy keyspace for the other two nodes you want to add to *dcNew*.
>    Note: remove the *initial_token* setting for these other nodes.
>    9. Set *auto_bootstrap* to *false* for the other two nodes you want to
>    add to *dcNew*.
>    10. Add the other two nodes to the cluster, one at a time.
>    11. If you are happy with the distribution, copy the data to *dcNew*
>    by running a rebuild.
>
>
> Hope this helps.
>
> Regards,
> Anthony
>
> On Fri, 29 Nov 2019 at 02:08, Enrico Cavallin <cavallin.enr...@gmail.com>
> wrote:
>
>> Hi all,
>> I have an old datacenter with 4 nodes and 256 tokens each.
>> I am now starting a new datacenter with 3 nodes, with num_tokens=4
>> and allocate_tokens_for_keyspace=myBiggestKeyspace on each node.
>> Both DCs run Cassandra 3.11.x.
>>
>> myBiggestKeyspace has RF=3 in dcOld and RF=2 in dcNew. Now dcNew is very
>> unbalanced.
>> Keyspaces with RF=2 in both DCs have the same problem.
>> Did I miss something, or are there strong limitations with a low
>> num_tokens even when using allocate_tokens_for_keyspace?
>> Any suggestions on how to mitigate it?
>>
>> # nodetool status myBiggestKeyspace
>> Datacenter: dcOld
>> =======================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>> UN  x.x.x.x  515.83 GiB  256     76.2%             fc462eb2-752f-4d26-aae3-84cb9c977b8a  rack1
>> UN  x.x.x.x  504.09 GiB  256     72.7%             d7af8685-ba95-4854-a220-bc52dc242e9c  rack1
>> UN  x.x.x.x  507.50 GiB  256     74.6%             b3a4d3d1-e87d-468b-a7d9-3c104e219536  rack1
>> UN  x.x.x.x  490.81 GiB  256     76.5%             41e80c5b-e4e3-46f6-a16f-c784c0132dbc  rack1
>>
>> Datacenter: dcNew
>> ==============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>> UN  x.x.x.x  145.47 KiB  4       56.3%             7d089351-077f-4c36-a2f5-007682f9c215  rack1
>> UN  x.x.x.x  122.51 KiB  4       55.5%             625dafcb-0822-4c8b-8551-5350c528907a  rack1
>> UN  x.x.x.x  127.53 KiB  4       88.2%             c64c0ce4-2f85-4323-b0ba-71d70b8e6fbf  rack1
>>
>> Thanks,
>> -- ec
>>
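
For reference, the token arithmetic in steps 4 and 5 of the quoted procedure can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: it assumes the Murmur3Partitioner token range of -2**63 to 2**63 - 1, it covers only the single seed node with four tokens, and it is not the algorithm from the linked blog post. The function names even_tokens and resolve_conflicts are hypothetical, and collecting the tokens already in use (for example from nodetool ring) is left out.

```python
# Assumption: Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
RING_SIZE = 2**64


def even_tokens(num_tokens):
    """Return num_tokens evenly spaced tokens, starting at the ring minimum."""
    step = RING_SIZE // num_tokens
    return [MIN_TOKEN + step * i for i in range(num_tokens)]


def resolve_conflicts(wanted, taken):
    """Step 5: bump each wanted token by one until no other node owns it.

    `taken` is a hypothetical input: the tokens already assigned to
    other nodes in the cluster.
    """
    used = set(taken)
    result = []
    for token in wanted:
        while token in used:
            token += 1
        used.add(token)
        result.append(token)
    return result


tokens = even_tokens(4)
# Value for the initial_token setting on the first (seed) node.
initial_token = ",".join(str(t) for t in tokens)
print(initial_token)
# prints -9223372036854775808,-4611686018427387904,0,4611686018427387904
```

With an empty list of existing tokens, resolve_conflicts returns its input unchanged; a conflicting token is only moved by the smallest amount needed, so the distribution stays effectively even.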