Unless the issue is "I have some giant partitions mixed in with non-giant ones," the usual reason for data size imbalance is that STCS is being used.
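
A quick way to confirm which strategy a table is on, assuming the table is visible from cqlsh (the keyspace and table names here are placeholders):

    cqlsh> DESCRIBE TABLE my_keyspace.my_table;

If the WITH clause in the output shows compaction={'class': 'SizeTieredCompactionStrategy', ...}, you are on STCS.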
You can look at nodetool cfhistograms and cfstats to get info about partition sizes. If you copy the data off to a test node and run "nodetool compact", does the data size drop down a bunch? That will tell you whether it's "compaction didn't merge yet" or "actually have more data in that token range".

Now, "where did the data come from?" If it's the compaction thing, most likely repair over-streaming on wide partitions. It could also be that you did a bunch of deletes, and the tombstones have been compacted with the data to be deleted on some nodes and not others.

Probably not these, but here are a few operational things that could also cause this:

- Ran "rebuild" on one of the nodes after it already had data.
- Wiped a node and put the data back using "repair", not bootstrap or rebuild.
- Ran sstableloader to load a backup over top of existing data.

Now, "how do I fix it": if your use case is such that LCS makes sense

http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/tabProp.html?scroll=tabProp__moreCompaction
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

then switching to LCS will make the data compact together more quickly, at the expense of a bunch of extra disk IO. (A sketch of the ALTER TABLE statement is at the bottom of this message, below the quoted thread.)

-Jeremiah

On May 12, 2014, at 9:19 AM, Oleg Dulin <oleg.du...@gmail.com> wrote:

> I keep asking the same question it seems -- sign of insanity.
>
> Cassandra version 1.2, not using vnodes (legacy).
>
> On 2014-03-07 19:37:48 +0000, Robert Coli said:
>
>> On Fri, Mar 7, 2014 at 6:00 AM, Oleg Dulin <oleg.du...@gmail.com> wrote:
>> I have the following situation:
>>
>> 10.194.2.5   RAC1   Up   Normal   378.6 GB    50.00%   0
>> 10.194.2.4   RAC1   Up   Normal   427.5 GB    50.00%   127605887595351923798765477786913079295
>> 10.194.2.7   RAC1   Up   Normal   350.63 GB   50.00%   85070591730234615865843651857942052864
>> 10.194.2.6   RAC1   Up   Normal   314.42 GB   50.00%   42535295865117307932921825928971026432
>>
>> As you can see, the 2.4 node has over 100 G more data than 2.6. You can definitely see the imbalance. It also happens to be the heaviest loaded node by CPU usage.
>>
>> The first step is to understand why.
>>
>> Are you using vnodes? What version of Cassandra?
>>
>> What would be a clean way to rebalance? If I use a move operation followed by cleanup, would it require a repair afterwards?
>>
>> Move is not, as I understand it, subject to CASSANDRA-2434, so should not require a post-move repair.
>>
>> =Rob
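
If you do go the LCS route, the change itself is just a table property update; here is a minimal sketch (the keyspace, table name, and sstable size are placeholders -- check the tabProp doc linked above for the options that apply to your version):

    ALTER TABLE my_keyspace.my_table
      WITH compaction = {'class': 'LeveledCompactionStrategy',
                         'sstable_size_in_mb': 160};

Expect a sustained burst of compaction activity (the extra disk IO mentioned above) while the existing SSTables get re-leveled; "nodetool compactionstats" will show the progress.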