On Thu, Mar 25, 2010 at 1:17 PM, Mike Malone <m...@simplegeo.com> wrote:
> On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>
>> The advantage to doing it the way Cassandra does is that you can keep
>> keys sorted with OrderPreservingPartitioner for range scans.  Grabbing
>> one token of many from each node in the ring would prohibit that.
>>
>> So we rely on active load balancing to get to a "good enough" balance,
>> say within 50%.  It doesn't need to be perfect.
>
> This makes sense for the order preserving partitioner. But for the random
> partitioner multiple tokens per node would certainly make balancing
> easier... I haven't dug into that bit of the Cassandra implementation yet.
> Would it be very difficult to support both modes of operation?

I guess that depends on what your threshold of "very" difficult is,
doesn't it. :)

Pretty much everything assumes that there is a 1:1 correspondence
between IP and Token.  Changing that is probably in the ballpark of "one
month to code, two to get the bugs out."  Gossip is one of the trickier
parts of our code base, and this change would be all over it.  The actual
storage system changes would be simpler, I think.
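
To make the trade-off concrete, here is a minimal sketch of a hashed token ring, not Cassandra's actual code. Everything in it is an assumption for illustration: the names `token_for`, `build_ring`, `owner`, and `load_counts` are hypothetical, MD5 stands in for a random partitioner, and the ring size is picked arbitrarily. With `tokens_per_node=1` each node owns exactly one token, which is the 1:1 IP-to-Token assumption described above; larger values show the multiple-tokens-per-node idea Mike is asking about.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space for this sketch


def token_for(key: str) -> int:
    """Hash a string to a ring position (stand-in for a random partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def build_ring(nodes, tokens_per_node=1):
    """Return parallel lists (tokens, owners), sorted by token.

    tokens_per_node=1 models one token per node (1:1 IP to Token);
    larger values model several tokens per node.
    """
    pairs = sorted(
        (token_for(f"{node}#{i}"), node)
        for node in nodes
        for i in range(tokens_per_node)
    )
    tokens = [t for t, _ in pairs]
    owners = [n for _, n in pairs]
    return tokens, owners


def owner(ring, key):
    """Node holding the first token >= the key's token, wrapping at the end."""
    tokens, owners = ring
    idx = bisect.bisect_left(tokens, token_for(key)) % len(tokens)
    return owners[idx]


def load_counts(ring, keys):
    """Count how many of the given keys land on each node."""
    counts = {node: 0 for node in ring[1]}
    for key in keys:
        counts[owner(ring, key)] += 1
    return counts


if __name__ == "__main__":
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
    keys = [f"row-{i}" for i in range(10000)]
    for n in (1, 64):
        print(n, load_counts(build_ring(nodes, n), keys))
```

Running it prints per-node key counts for one token per node and for 64 tokens per node, so the spread can be compared directly. Note that the lookup only works this way because keys are hashed; with OrderPreservingPartitioner, scattering many small token ranges across nodes would break up the contiguous ranges that make range scans cheap.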

-Jonathan
