I got myself into a situation where one node (10.47.108.100) holds far more data than the other nodes. In fact, the 1 TB disk on this node is almost full. I added 3 new nodes and let Cassandra calculate their tokens automatically, which it does by targeting the most heavily loaded nodes. Unfortunately, this node is still responsible for a big token range (5113... - 85070...). Yes, I know that one option would be to rebalance the entire cluster with nodetool move, but that is an extremely time-consuming and error-prone process because of the amount of data involved.
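For reference, here is a rough sketch of the tokens a full rebalance with move would be aiming for. It assumes RandomPartitioner (a token space of 0 to 2^127, which is what my tokens look like) and the eight nodes in the ring listing further down; the function name `balanced_tokens` is just my own helper, not anything from Cassandra.

```python
# Sketch only: evenly spaced tokens, assuming RandomPartitioner (tokens 0 .. 2**127).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens around the ring."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Eight nodes, matching the ring listing in this post.
    for i, token in enumerate(balanced_tokens(8)):
        print(f"node {i}: {token}")
```

Every node whose current token differs from its target would need its own move, which is why I would like to avoid this route.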
Our RF = 3 and we read and write at QUORUM. The nodes have been repaired, so I think the data should be in good shape.

Question: Can I get myself out of this mess without installing new nodes? I was thinking of using either decommission or removetoken on this node to have the cluster "rebalance itself", and then re-bootstrapping the node with a new token.

Address         Status  State   Load       Owns    Token
                                                   127605887595351923798765477786913079296
10.46.108.100   Up      Normal  218.52 GB  25.00%  0
10.46.108.101   Up      Normal  260.04 GB  12.50%  21267647932558653966460912964485513216
10.46.108.104   Up      Normal  286.79 GB  17.56%  51138582157040063602728874106478613120
10.47.108.100   Up      Normal  874.91 GB  19.94%  85070591730234615865843651857942052863
10.47.108.102   Up      Normal  302.79 GB  4.16%   92156241323118845370666296304459139297
10.47.108.103   Up      Normal  242.02 GB  4.16%   99241191538897700272878550821956884116
10.47.108.101   Up      Normal  439.9 GB   8.34%   113427455640312821154458202477256070484
10.46.108.103   Up      Normal  304 GB     8.33%   127605887595351923798765477786913079296
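To double-check the Owns column, here is a small sketch that recomputes each node's primary range from the tokens in the ring output above. It again assumes RandomPartitioner (ring size 2^127) and that a node owns the range from the previous token up to its own. It only covers primary ownership, so it says nothing about the replicas each node also stores with RF = 3. `RING` and `ownership` are names I made up for this example.

```python
# Sketch only: recompute primary ownership, assuming RandomPartitioner (tokens 0 .. 2**127).
RING_SIZE = 2 ** 127

# (address, load in GB, token), copied from the ring output in this post.
RING = [
    ("10.46.108.100", 218.52, 0),
    ("10.46.108.101", 260.04, 21267647932558653966460912964485513216),
    ("10.46.108.104", 286.79, 51138582157040063602728874106478613120),
    ("10.47.108.100", 874.91, 85070591730234615865843651857942052863),
    ("10.47.108.102", 302.79, 92156241323118845370666296304459139297),
    ("10.47.108.103", 242.02, 99241191538897700272878550821956884116),
    ("10.47.108.101", 439.9, 113427455640312821154458202477256070484),
    ("10.46.108.103", 304.0, 127605887595351923798765477786913079296),
]


def ownership(ring):
    """Map address -> fraction of the ring covered by that node's primary range."""
    ordered = sorted(ring, key=lambda node: node[2])
    result = {}
    for i, (address, _load, token) in enumerate(ordered):
        # Index -1 wraps to the last node, so the first node's range crosses token 0.
        previous = ordered[i - 1][2]
        result[address] = ((token - previous) % RING_SIZE) / RING_SIZE
    return result


if __name__ == "__main__":
    loads = {address: load for address, load, _token in RING}
    for address, fraction in ownership(RING).items():
        print(f"{address:15s} {fraction * 100:6.2f}%  {loads[address]:7.2f} GB")
```

The recomputed percentages should match the Owns column to within rounding, which confirms that 10.47.108.100 holds far more data than its 19.94% primary range alone would suggest.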