On Tue, Apr 5, 2011 at 9:59 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> On Tue, Apr 5, 2011 at 8:37 PM, Yudong Gao <st...@umich.edu> wrote:
>> One thing I am worried about is how to maintain the location
>> information for each row. The current partitioner maps a key to its
>> MD5 hash, and it is almost impossible to control the hashed token by
>> manipulating the value of the key. Also, maintaining a key-to-location
>> mapping would not scale. My initial thought is to use the key
>> string as the token directly, so that the location information can be
>> bound into the key. This minimizes the changes to the other
>> components.
>
> This is what ByteOrderedPartitioner does, but that tends to create hot
> spots since sequential keys are stored on the same node.
>
> A better solution would be to just push the DecoratedKey into the
> ReplicationStrategy so it can make its decision before information is
> thrown away.

I agree. So in this case, I guess the hash-based token ring is still
preserved to avoid hot spots, but we further use the DecoratedKey to
guide the replication strategy. For example, replica 2 is placed on
the first node along the ring that belongs to the desired data center
(based on the location hint embedded in the DecoratedKey). But we may
not be able to control the primary replica, since its position is still
determined by the hashed token. Do you think this would be a
reasonable design?
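
To make the idea concrete, here is a rough sketch in plain Python. It is
not Cassandra code and does not use the real ReplicationStrategy API. The
"<dc>:<rest>" key convention, the function names, and deriving node
tokens from node names are all assumptions made up for illustration.

```python
import hashlib
from bisect import bisect_left


def md5_token(key: str) -> int:
    """Hash a key to an integer token (stand-in for an MD5-based partitioner)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes):
    """Build a sorted ring of (token, node_name, data_center) tuples.

    Assumption: node tokens are derived from node names, only to keep
    the sketch self-contained.
    """
    return sorted((md5_token(name), name, dc) for name, dc in nodes)


def location_hint(key: str):
    """Hypothetical convention: keys look like '<dc>:<rest>'; no colon means no hint."""
    dc, sep, _ = key.partition(":")
    return dc if sep else None


def place_replicas(ring, key, replication_factor=2):
    """Pick replica nodes for a key.

    The primary replica is chosen by the hashed token alone. The second
    replica is the first node further along the ring in the hinted data
    center, if there is one. Remaining replicas follow ring order.
    """
    tokens = [token for token, _, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, md5_token(key)) % len(ring)
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    replicas = [walk[0][1]]  # primary: not influenced by the hint
    hint = location_hint(key)
    hinted = next((name for _, name, dc in walk[1:] if dc == hint), None)
    if hinted is not None and replication_factor > 1:
        replicas.append(hinted)

    # Fill any remaining slots in plain ring order, skipping chosen nodes.
    for _, name, _ in walk[1:]:
        if len(replicas) >= replication_factor:
            break
        if name not in replicas:
            replicas.append(name)
    return replicas[:replication_factor]


if __name__ == "__main__":
    ring = build_ring([("n1", "us"), ("n2", "eu"), ("n3", "us"), ("n4", "eu")])
    print(place_replicas(ring, "eu:user42"))
```

Note that the primary replica still lands wherever the hash puts it, so
only the second replica is guaranteed to follow the hint.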

>
>> Do you know how existing applications achieve this without
>> per-row support?
>
> All existing applications place replicas by keyspace, not by row.
>

I see. So I guess each keyspace is mapped to one data center in the
desired location? Just curious, are they happy with the current
per-keyspace solution, and have there been any requests for per-row
placement control?

Thanks!

Yudong

> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
