Hi,

I have some thoughts on load balancing of current / new nodes. I have come across some posts around this, but I am not sure what is finally being proposed, so..
From what I have read, a nodetool loadbalance on a node does a decommission and bootstrap of that node. Is there a reason it works that way (decommission and bootstrap) rather than simply looking at the node's next neighbour and splitting the load with it? For example, if the ring has nodes A, B, C and D with loads (in GB) of 100, 70, 100 and 80 respectively, then a nodetool loadbalance on B should result in 100, 85, 85, 80 (some tokens move from C to B). It is still manual, but the data movement is only what is needed: 15 GB instead of the 100+ GB of a decommission and bootstrap. The idea is not to get a perfect balance, but an acceptable balance with less data movement.

Also, when a new node is added, it takes 50% of the load from the most loaded node. Don't we want to rebalance so that the load is more or less evenly distributed across the cluster? Would it not help if I could specify the target % load as a parameter to the rebalance command, so that I can optimize the movement of data for rebalancing? E.g. A, B, C, E is a cluster with loads of 80, 78, 83 and 84. Now I add a new node D (its position will be before E), so eventually, after all the rebalance activity, I want each node's load to be ~65 (325/5). To minimize the movement of data and still get a good balance, we move only what is needed, so data sort of flows from more loaded to less loaded nodes until things are balanced. This could be a manual process (I am basically suggesting the same approach as in the first paragraph).

Another thought: instead of using pure current usage on a node to determine load, shouldn't there be a higher-level concept like a "node weight" to handle heterogeneous nodes, or is the expectation that all nodes are more or less equal?

Thanks
Anand
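P.S. To make the "data flows from more loaded to less loaded nodes until balanced" idea concrete, here is a toy Python sketch. It is nothing like the actual nodetool code; plan_rebalance and the weights parameter are names I made up just to illustrate the kind of plan I have in mind, including the "node weight" knob for heterogeneous nodes.

# Toy sketch: plan only neighbour-to-neighbour transfers so that each node
# ends up near its (optionally weighted) share of the total load.
# Simplification: the ring is treated as a chain, i.e. no wrap-around move
# between the last and the first node.

def plan_rebalance(loads, weights=None):
    n = len(loads)
    weights = weights or [1.0] * n
    total = sum(loads)
    # Target load per node, proportional to its weight.
    targets = [total * w / sum(weights) for w in weights]

    moves = []    # (from_node, to_node, gigabytes)
    carry = 0.0   # surplus (+) or deficit (-) pushed along to the next node
    for i in range(n - 1):
        carry += loads[i] - targets[i]
        if carry > 0:
            moves.append((i, i + 1, carry))    # node i sheds data forward
        elif carry < 0:
            moves.append((i + 1, i, -carry))   # node i pulls data from i+1
    return targets, moves

# Example from above: A, B, C, D with 100, 70, 100, 80 GB.
targets, moves = plan_rebalance([100, 70, 100, 80])
print(targets)  # [87.5, 87.5, 87.5, 87.5]
print(moves)    # [(0, 1, 12.5), (2, 1, 5.0), (2, 3, 7.5)] -> 25 GB moved in
                # total, versus 100+ GB for a decommission and bootstrap.

# Heterogeneous nodes: give C twice the weight so it keeps ~2x the data.
print(plan_rebalance([100, 70, 100, 80], weights=[1, 1, 2, 1])[0])
# [70.0, 70.0, 140.0, 70.0]

It is deliberately crude, but hopefully it shows the "move only what is needed between neighbours" style of rebalance I am suggesting.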