Hi Enrico,

Glad to hear the problem has been resolved and thank you for the feedback!

Kind regards,
Anthony

On Mon, 2 Dec 2019 at 22:03, Enrico Cavallin <cavallin.enr...@gmail.com> wrote:

> Hi Anthony,
> thank you for your hints, now the new DC is well balanced within 2%.
> I did read your article, but I thought it was needed only for new
> "clusters", not also for new "DCs"; but RF is per DC so it makes sense.
>
> You TLP guys are doing a great job for the Cassandra community.
>
> Thank you,
> Enrico
>
>
> On Fri, 29 Nov 2019 at 05:09, Anthony Grasso <anthony.gra...@gmail.com>
> wrote:
>
>> Hi Enrico,
>>
>> This is a classic chicken and egg problem with the
>> allocate_tokens_for_keyspace setting.
>>
>> The allocate_tokens_for_keyspace setting uses the replication factor of
>> a keyspace in a DC to calculate the token allocation when a node is added
>> to the cluster for the first time.
>>
>> Nodes need to be added to the new DC before we can replicate the keyspace
>> over to it. Herein lies the problem. We are unable to use
>> allocate_tokens_for_keyspace unless the keyspace is replicated to the
>> new DC. In addition, as soon as you change the keyspace replication to the
>> new DC, new data will start to be written to it. To work around this issue
>> you will need to do the following.
>>
>>    1. Decommission all the nodes in the *dcNew*, one at a time.
>>    2. Once all the *dcNew* nodes are decommissioned, wipe the contents
>>    in the *commitlog*, *data*, *saved_caches*, and *hints* directories
>>    of these nodes.
>>    3. Make the first node to add into the *dcNew* a seed node. Set the
>>    seed list of the first node with its IP address and the IP addresses
>>    of the other seed nodes in the cluster.
>>    4. Set the *initial_token* setting for the first node. You can
>>    calculate the values using the algorithm in my blog post:
>>    https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>>    For convenience I have calculated them:
>>    *-9223372036854775808,-4611686018427387904,0,4611686018427387904*.
>>    Note: remove the *allocate_tokens_for_keyspace* setting from the
>>    *cassandra.yaml* file for this (seed) node.
>>    5. Check to make sure that no other node in the cluster is assigned
>>    any of the four tokens specified above. If there is another node in
>>    the cluster that is assigned one of the above tokens, increment the
>>    conflicting token by one, repeatedly, until no other node in the
>>    cluster is assigned that token value. The idea is to make sure that
>>    these four tokens are unique to the node.
>>    6. Add the seed node to the cluster. Make sure it is listed in *dcNew*
>>    by checking nodetool status.
>>    7. Create a dummy keyspace in *dcNew* that has a replication factor
>>    of 2.
>>    8. Set the *allocate_tokens_for_keyspace* value to be the name of the
>>    dummy keyspace for the other two nodes you want to add to *dcNew*.
>>    Note: remove the *initial_token* setting for these other nodes.
>>    9. Set *auto_bootstrap* to *false* for the other two nodes you want
>>    to add to *dcNew*.
>>    10. Add the other two nodes to the cluster, one at a time.
>>    11. If you are happy with the distribution, copy the data to *dcNew*
>>    by running a rebuild.
>>
>>
>> Hope this helps.
>>
>> Regards,
>> Anthony
>>
>> On Fri, 29 Nov 2019 at 02:08, Enrico Cavallin <cavallin.enr...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>> I have an old datacenter with 4 nodes and 256 tokens each.
>>> I am now starting a new datacenter with 3 nodes and num_tokens=4
>>> and allocate_tokens_for_keyspace=myBiggestKeyspace in each node.
>>> Both DCs run Cassandra 3.11.x.
>>>
>>> myBiggestKeyspace has RF=3 in dcOld and RF=2 in dcNew. Now dcNew is very
>>> unbalanced.
>>> Also keyspaces with RF=2 in both DCs have the same problem.
>>> Did I miss something, or even with allocate_tokens_for_keyspace do I have
>>> strong limitations with low num_tokens?
>>> Any suggestions on how to mitigate it?
>>>
>>> # nodetool status myBiggestKeyspace
>>> Datacenter: dcOld
>>> =======================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>>> UN  x.x.x.x  515.83 GiB  256     76.2%             fc462eb2-752f-4d26-aae3-84cb9c977b8a  rack1
>>> UN  x.x.x.x  504.09 GiB  256     72.7%             d7af8685-ba95-4854-a220-bc52dc242e9c  rack1
>>> UN  x.x.x.x  507.50 GiB  256     74.6%             b3a4d3d1-e87d-468b-a7d9-3c104e219536  rack1
>>> UN  x.x.x.x  490.81 GiB  256     76.5%             41e80c5b-e4e3-46f6-a16f-c784c0132dbc  rack1
>>>
>>> Datacenter: dcNew
>>> ==============
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
>>> UN  x.x.x.x  145.47 KiB  4       56.3%             7d089351-077f-4c36-a2f5-007682f9c215  rack1
>>> UN  x.x.x.x  122.51 KiB  4       55.5%             625dafcb-0822-4c8b-8551-5350c528907a  rack1
>>> UN  x.x.x.x  127.53 KiB  4       88.2%             c64c0ce4-2f85-4323-b0ba-71d70b8e6fbf  rack1
>>>
>>> Thanks,
>>> -- ec
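
For reference, the four initial_token values quoted in Anthony's reply can be reproduced by splitting the Murmur3Partitioner token range (-2^63 to 2^63 - 1) into equal parts. The sketch below is only an illustration of that arithmetic for the first node, plus the "increment by one until unique" check from his step 5. The function names even_tokens and avoid_conflicts are hypothetical, and the set of tokens already in use is assumed to be collected by hand (for example from nodetool ring); this is not the full algorithm from the blog post.

```python
# Assumption: the cluster uses Murmur3Partitioner, whose tokens are
# signed 64-bit integers in the range [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
RING_SIZE = 2**64


def even_tokens(num_tokens):
    """Return num_tokens evenly spaced tokens, starting at the minimum token."""
    step = RING_SIZE // num_tokens
    return [MIN_TOKEN + i * step for i in range(num_tokens)]


def avoid_conflicts(tokens, taken):
    """Bump each token by one until it is not already assigned elsewhere.

    `taken` is the set of tokens owned by other nodes in the cluster
    (hypothetical input, gathered manually).
    """
    used = set(taken)
    result = []
    for token in tokens:
        while token in used:
            token += 1
        used.add(token)
        result.append(token)
    return result


if __name__ == "__main__":
    tokens = even_tokens(4)
    # Value to paste into initial_token for the first (seed) node.
    print(",".join(str(t) for t in tokens))
```

With num_tokens=4 this prints -9223372036854775808,-4611686018427387904,0,4611686018427387904, matching the values given in step 4.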