The advantage to doing it the way Cassandra does is that you can keep keys sorted with OrderPreservingPartitioner for range scans. grabbing one token of many from each node in the ring would prohibit that.
So we rely on active load balancing to get to a "good enough" balance, say within 50%. It doesn't need to be perfect. On Thu, Mar 25, 2010 at 10:37 AM, Daniel Kluesing <d...@bluekai.com> wrote: > I wanted to check my understanding of the load balance operation. Let's say I > have 5 nodes, each of them has been assigned at startup 1/5 of the ring, and > the load is equal across them (say using random partitioner). The load on the > cluster gets high, so I add a sixth server. During bootstrap, the new server > will pick the existing server with the highest load, and take half the load > from that server. After boot strap, I would end up with 4 servers with 1/5 of > the ring each, and 2 servers with 1/10 of the ring each - is this correct? > I'll get hotspots unless I double the number of nodes? > > Really I would want adding a sixth server to result in six machines with 1/6 > of the load taken evenly from the existing nodes. If I understand - and > correct me if I'm wrong -, the core of this is that each server is assigned > one token, while in a system like dynamo, a server is assigned multiple > tokens around the ring. Is there any benefit to only assigning one token? Has > anyone worked on assigning a server multiple tokens, or is there some other > unrelated way to get more even load distribution when adding a node? >