On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> The advantage to doing it the way Cassandra does is that you can keep
> keys sorted with OrderPreservingPartitioner for range scans. grabbing
> one token of many from each node in the ring would prohibit that.
>
> So we rely on active load balancing to get to a "good enough" balance,
> say within 50%. It doesn't need to be perfect.

This makes sense for the order-preserving partitioner. But for the random partitioner, multiple tokens per node would certainly make balancing easier... I haven't dug into that bit of the Cassandra implementation yet. Would it be very difficult to support both modes of operation?

For what it's worth, we've already seen annoying behavior when adding nodes to the cluster. It's obviously true that the absolute size of partitions becomes smaller as the cluster grows, but if your relatively balanced 100-node cluster is at, say, 70% capacity and you add 10 more nodes, you would presumably want this additional capacity to be evenly distributed. And right now that's pretty much impossible to do without rebalancing the entire cluster.

Mike
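
To make the 100 + 10 node scenario concrete, here is a rough simulation sketch. It is not Cassandra code: the ring size, the purely random token choice, and the names `random_tokens`, `owned_span` and `relieved_nodes` are all assumptions made for illustration. It counts how many of the existing nodes actually hand off part of their range when the new nodes join.

```python
import random

RING = 2 ** 127  # token space size; an assumption for this sketch


def random_tokens(nodes, tokens_per_node, rng):
    """Give every node `tokens_per_node` random tokens on the ring."""
    return {n: [rng.randrange(RING) for _ in range(tokens_per_node)]
            for n in nodes}


def owned_span(tokens_by_node):
    """Total ring span owned by each node.

    Each token owns the range from the previous token up to itself,
    wrapping around at the end of the ring.
    """
    ring = sorted((t, n) for n, toks in tokens_by_node.items() for t in toks)
    span = {n: 0 for n in tokens_by_node}
    prev = ring[-1][0] - RING  # the last token precedes the first one
    for token, node in ring:
        span[node] += token - prev
        prev = token
    return span


def relieved_nodes(tokens_per_node, old=100, new=10, seed=1):
    """Count the old nodes that hand off data when `new` nodes join."""
    rng = random.Random(seed)
    ring = random_tokens([f"old{i}" for i in range(old)], tokens_per_node, rng)
    before = owned_span(ring)
    ring.update(random_tokens([f"new{i}" for i in range(new)],
                              tokens_per_node, rng))
    after = owned_span(ring)
    return sum(1 for n in before if after[n] < before[n])
```

With one token per node, `relieved_nodes(1)` can never exceed 10, since each new token splits exactly one existing range. With many tokens per node, e.g. `relieved_nodes(256)`, the new nodes take small slices from nearly every old node, which is the even spread of added capacity described above.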