how to solve one node is in heavy load in unbalanced cluster

Yan Chunlu Thu, 28 Jul 2011 11:25:47 -0700

I have three nodes and RF=3. Here is the current ring:


Address  Status  State   Load      Owns    Token
                                           84944475733633104818662955375549269696
node1    Up      Normal  15.32 GB  81.09%  52773518586096316348543097376923124102
node2    Up      Normal  22.51 GB  10.48%  70597222385644499881390884416714081360
node3    Up      Normal  56.1 GB   8.43%   84944475733633104818662955375549269696
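
For reference, the Owns column can be reproduced from the tokens alone. This is a minimal sketch, assuming the cluster uses RandomPartitioner with a token space of 0 to 2**127 (the ring output does not state this) and that each node owns the range from the previous token, exclusive, up to its own token. The names RING_SIZE and ownership are only illustrative.

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127

# Tokens copied from the ring listing above.
tokens = {
    "node1": 52773518586096316348543097376923124102,
    "node2": 70597222385644499881390884416714081360,
    "node3": 84944475733633104818662955375549269696,
}

def ownership(tokens):
    """Return the fraction of the ring owned by each node.

    A node owns the range between the previous node's token and its own
    token; the node with the lowest token wraps around from the highest one.
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    shares = {}
    for i, (name, token) in enumerate(ordered):
        prev_token = ordered[i - 1][1]  # index -1 wraps to the last node
        shares[name] = ((token - prev_token) % RING_SIZE) / RING_SIZE
    return shares

for name, share in ownership(tokens).items():
    print(f"{name}: {share * 100:.2f}%")
```

node1 owns so much because its range wraps around from node3's token through zero.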


The ring is very unbalanced, and I would like to rebalance it with
"nodetool move" as soon as possible. Unfortunately, I haven't run
nodetool repair for a long time.

Aaron suggested that it is better to run repair on every node first and then rebalance.
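
The target tokens for the move could be computed as below. This is a sketch under the same assumption of RandomPartitioner (token space 0 to 2**127), using the usual even spacing i * 2**127 / N. The helper name balanced_tokens is hypothetical, and which node receives which token is an arbitrary choice here, not something decided in this thread.

```python
# Assumption: RandomPartitioner, whose token space is 0 .. 2**127.
RING_SIZE = 2 ** 127

def balanced_tokens(node_count):
    """Return evenly spaced tokens, one per node, starting at 0."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

# Host names are taken from the ring listing; the node-to-token
# assignment is illustrative only.
nodes = ["node1", "node2", "node3"]
for host, token in zip(nodes, balanced_tokens(len(nodes))):
    # Only prints the command; nothing is executed against the cluster.
    print(f"nodetool -h {host} move {token}")
```

Moving one node at a time and waiting for each move to finish keeps the extra streaming load limited to one range at a time.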


The problem is that node3 is currently under heavy load, and the entire
cluster slows down whenever I start a repair. I had to run
disablegossip and disablethrift to stop the repair.

Cassandra is the only thing running on that server, and I have no idea
what it is doing. The CPU load is currently above 20. compactionstats
and netstats show that it is not doing anything.

I have changed the clients so that they no longer connect to node3, but
it still seems to be under heavy load, and I/O utilization is at 100%.


The log looks normal (although I am not sure what to make of the
"Dropped READ messages" warning):

 INFO 13:21:38,191 GC for ParNew: 345 ms, 627003992 reclaimed leaving 2563726360 used; max is 4248829952
 WARN 13:21:38,560 Dropped 826 READ messages in the last 5000ms
 INFO 13:21:38,560 Pool Name                    Active   Pending
 INFO 13:21:38,560 ReadStage                         8      7555
 INFO 13:21:38,561 RequestResponseStage              0         0
 INFO 13:21:38,561 ReadRepairStage                   0         0
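
To watch for this pattern over time, the pool lines could be pulled out of the log with a small script. This is only a sketch: the regular expression is written against the exact lines quoted above, and the function name backlogged_pools and the threshold of 100 pending tasks are arbitrary assumptions, not Cassandra defaults.

```python
import re

# Log excerpt copied from the lines above.
LOG = """\
 WARN 13:21:38,560 Dropped 826 READ messages in the last 5000ms
 INFO 13:21:38,560 Pool Name                    Active   Pending
 INFO 13:21:38,560 ReadStage                         8      7555
 INFO 13:21:38,561 RequestResponseStage              0         0
 INFO 13:21:38,561 ReadRepairStage                   0         0
"""

# Matches "INFO <time> <PoolName> <active> <pending>"; the header line
# does not match because "Name" is not a number.
POOL_LINE = re.compile(r"^\s*INFO\s+\S+\s+(\w+)\s+(\d+)\s+(\d+)\s*$")

def backlogged_pools(log_text, pending_threshold=100):
    """Return {pool: (active, pending)} for pools at or over the threshold."""
    pools = {}
    for line in log_text.splitlines():
        match = POOL_LINE.match(line)
        if match:
            name = match.group(1)
            active, pending = int(match.group(2)), int(match.group(3))
            if pending >= pending_threshold:
                pools[name] = (active, pending)
    return pools

print(backlogged_pools(LOG))
```

On the excerpt above only ReadStage is reported, with 8 active and 7555 pending reads.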



Is there any way to tell what node3 is doing? Or, at least, is there any
way to keep it from slowing down the whole cluster?
