Hi,

I have some thoughts on load balancing across current / new nodes. I have
come across some posts on this, but I am not sure what is finally being
proposed, so...

From what I have read, a nodebalance on a node does a decommission and
bootstrap of that node. Is there a reason why it works that way (decommission
and bootstrap) instead of simply looking at the next neighbor and splitting
the load with it? For example, say the ring has nodes A, B, C and D with
loads (in GB) of 100, 70, 100 and 80 respectively. Then a nodetool balance
on B should result in 100, 85, 85, 80 (some tokens move from C to B). It is
still manual, but the data movement is only what is needed: 15 GB instead of
the 100+ GB (decommission and bootstrap). The idea is not to get a perfect
balance, but an acceptable balance with less data movement.
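
To make the arithmetic concrete, here is a rough sketch in Python of the
neighbor split I have in mind. The function name is made up for illustration
(it is not an existing nodetool operation), and it assumes load can be
shifted between a node and its next neighbor simply by moving tokens.

```python
def split_with_next_neighbor(loads, i):
    """Even out the load between node i and its next neighbor on the ring.

    Hypothetical helper for illustration only. Loads are in GB.
    Returns the new list of loads and the amount of data moved.
    """
    j = (i + 1) % len(loads)              # next neighbor, wrapping around the ring
    target = (loads[i] + loads[j]) / 2    # both nodes end up at the pair's average
    moved = abs(loads[i] - target)        # data that has to cross between the two
    new_loads = list(loads)
    new_loads[i] = new_loads[j] = target
    return new_loads, moved


ring = [100, 70, 100, 80]                 # A, B, C, D
new_ring, moved = split_with_next_neighbor(ring, 1)   # "balance" on B
print(new_ring, moved)                    # [100, 85.0, 85.0, 80] 15.0
```

Only the 15 GB between B and C moves; A and D are untouched.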

Also, when a new node is added, it takes 50% from the most loaded node. Don't
we want to rebalance so that the load is more or less evenly distributed
across the cluster? Would it not help if I could specify the % load as a
parameter to the rebalance command, so that I can optimize the movement of
data for rebalancing? E.g. A, B, C, E is a cluster with loads of 80, 78, 83
and 84. Now I add a new node D (its position will be before E), so
eventually, after all the rebalance activity, I want the load on each node
to be ~65 (325/5). To minimize the movement of data and still get a good
balance, we move only what is needed (so data sort of flows from more to
less loaded nodes until balanced). This could be a manual process (I am
basically suggesting a similar approach to the one in the previous
paragraph).
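
As a rough sketch of the "data flows from more to less loaded nodes" idea,
the Python below computes how much would have to cross each pair of adjacent
nodes to bring every node to the cluster average. The function name is made
up, and the sketch assumes data can only be shifted between ring neighbors,
ignores the wrap-around link from E back to A, and ignores replication.

```python
def neighbor_flows(loads):
    """Return (target, flows), where flows[k] is the GB moved from node k
    to node k+1. A negative value means the data moves from k+1 to k.

    Hypothetical helper for illustration only.
    """
    target = sum(loads) / len(loads)
    flows = []
    carry = 0.0
    for load in loads[:-1]:
        carry += load - target    # surplus accumulated so far is pushed onward
        flows.append(carry)
    return target, flows


names = ["A", "B", "C", "D", "E"]
loads = [80, 78, 83, 0, 84]       # D is the new, empty node
target, flows = neighbor_flows(loads)
print(target)                      # 65.0
for k, f in enumerate(flows):
    src, dst = (names[k], names[k + 1]) if f >= 0 else (names[k + 1], names[k])
    print(f"{src} -> {dst}: {abs(f):.0f} GB")
```

This aims for a perfect balance; stopping once every node is within some
acceptable distance of the average would move less data.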

Another thought is that instead of using pure current usage on a node to
determine load, shouldn't there be a higher-level concept like "node weight"
to handle heterogeneous nodes, or is the expectation that all nodes are more
or less equal?
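
By "node weight" I mean something like the sketch below, where each node's
target load is its share of the total in proportion to its weight. The
function name and the weights are made up for illustration; nothing like
this exists in the current tooling as far as I know.

```python
def weighted_targets(loads, weights):
    """Target load per node, proportional to each node's weight.

    Hypothetical helper for illustration only. Loads are in GB.
    """
    total = sum(loads)
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


loads = [100, 70, 100, 80]        # A, B, C, D
weights = [2, 1, 1, 1]            # assumed: A has twice the capacity of the others
targets = weighted_targets(loads, weights)
print(targets)                    # [140.0, 70.0, 70.0, 70.0]
```

With equal weights this reduces to the plain average across the cluster.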


Thanks
Anand
-- 
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Question-on-load-balancing-in-a-cluster-tp5375140p5375140.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.
