Suppose I have a Cassandra cluster whose data is skewed such that one node has 40% more data than the other nodes, even though the tokens were distributed uniformly when the cluster was created. To make the data uniform, I would have to recalculate the tokens and assign them to the nodes in the cluster, then run repair and cleanup. The question is: how do I recalculate the tokens and assign them to the nodes (keeping cost, distance between nodes, and data movement in mind)?

Regards
Akshit Jain
B-Tech,2013124
9891724697

On Wed, Sep 13, 2017 at 11:54 AM, Hannu Kröger <hkro...@gmail.com> wrote:

> Hi,
>
> you should make sure that the token range is evenly distributed if you have a
> single token configured per node. You can use e.g. this tool to calculate
> tokens:
> https://www.geroba.com/cassandra/cassandra-token-calculator/
>
> Also, make sure that none of the partitions in your data model are
> hotspots that contain a lot more data than average. Also check
> materialized views if you use them.
>
> Also, due to the way compactions work, it’s normal that the disk usage
> goes up and down. Since nodes often do that in different rhythms, you
> will always see that some node(s) are using more disk space than others at
> some point in time, especially if you do updates & deletes and not just inserts.
>
> Cheers,
> Hannu
>
> On 13 September 2017 at 07:47:09, Akshit Jain (akshit13...@iiitd.ac.in)
> wrote:
>
> Hi,
> Can a cassandra cluster be unbalanced in terms of data?
> If yes then how to rebalance a cassandra cluster.
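
For the single-token-per-node case described in the reply, evenly spaced tokens can also be computed by hand. The following is only a minimal sketch, assuming Murmur3Partitioner (token range -2^63 to 2^63-1) or RandomPartitioner (0 to 2^127-1) and a single datacenter. The function names even_tokens and ownership are made up for illustration and are not part of Cassandra.

```python
# Sketch: evenly spaced tokens for a single-token-per-node ring.
# Assumption: one datacenter, one token per node.

def ring_params(partitioner="murmur3"):
    """Return (first_token, ring_size) for the given partitioner."""
    if partitioner == "murmur3":
        return -(2 ** 63), 2 ** 64
    if partitioner == "random":
        return 0, 2 ** 127
    raise ValueError("unknown partitioner: %s" % partitioner)


def even_tokens(node_count, partitioner="murmur3"):
    """Tokens that split the ring into node_count equal ranges."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    start, ring = ring_params(partitioner)
    step = ring // node_count
    return [start + i * step for i in range(node_count)]


def ownership(tokens, partitioner="murmur3"):
    """Fraction of the ring owned by each token, in sorted token order.

    A node owns the range from the previous token (exclusive) up to its
    own token (inclusive), wrapping around the ring.
    """
    _, ring = ring_params(partitioner)
    ordered = sorted(tokens)
    fractions = []
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the last token
        width = (token - previous) % ring or ring  # a lone node owns it all
        fractions.append(width / ring)
    return fractions


if __name__ == "__main__":
    for index, token in enumerate(even_tokens(4)):
        print("node %d: %d" % (index, token))
```

To keep data movement down, one option is to leave the nodes in their current ring order and give each node the new token closest to its current one. Each node could then be moved one at a time with nodetool move <new_token>, followed by nodetool cleanup. Running ownership on the current tokens first shows whether the token ranges are actually uneven, or whether the skew comes from hot partitions or compaction timing as mentioned above.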