On Nov 9, 2011, at 3:49 PM, Nate Lawson wrote:

> On Nov 9, 2011, at 3:33 PM, Elias Levy wrote:
> 
>> On Wed, Nov 9, 2011 at 3:29 PM, Phil Stanhope <stanh...@gmail.com> wrote:
>> Tread carefully here ... by forcing locality ... you will sacrifice high 
>> availability by algorithmically creating a bias and a single point of 
>> failure in the cluster. 
>> 
>> You don't have to lose high availability; your data is still being 
>> replicated, but you can create hot spots.  Know your data.
> 
> Correct. Partitioning based on SHA-1(DocumentID) is the same situation as 
> doing it based on SHA-1(entire_key), which is how Riak currently works. Even 
> if "entire_key" and "DocumentID" are both just simple counters, it is the 
> same situation.
> 
> We would only need to worry if the pair BucketName + DocumentID was not unique 
> (say, skewed towards 0 or something). In that case, we'd need to analyze the 
> distribution of DocumentID values to be sure the partition is balanced.


Sorry to reply to myself, but I wanted to add more detail.

You have multiple ways you could generate partitions: bucket, key-prefix, key, 
or even key+value. The question is really, "how many items do I need before the 
law of large numbers gets me enough balancing?" The answer depends on the data, 
as Elias mentioned.
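
To put rough numbers on the "how many items" question, here is a minimal 
Python sketch. It is not Riak code: the 64-partition count, the function 
names, and the "hash modulo partition count" mapping are all assumptions made 
for illustration (Riak's ring assigns ranges of the hash space rather than 
taking a modulo). It hashes simple counter keys with SHA-1 and reports how 
far the fullest partition is from an ideal even share.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 64  # hypothetical partition count, chosen for illustration


def partition_for(data: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map bytes to a partition: SHA-1 digest as an integer, modulo the partition count."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") % num_partitions


def imbalance(num_items: int, num_partitions: int = NUM_PARTITIONS) -> float:
    """Fullest partition's item count divided by the ideal even share.

    Keys are plain counters 0..num_items-1, encoded as decimal strings.
    A result of 1.0 would mean a perfectly balanced fullest partition.
    """
    counts = Counter(
        partition_for(str(i).encode(), num_partitions) for i in range(num_items)
    )
    ideal = num_items / num_partitions
    return max(counts.values()) / ideal


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, round(imbalance(n), 3))
```

With only a handful of items per partition the ratio is well above 1; it 
should drift toward 1 as the item count grows. This only counts items, so it 
says nothing about value sizes or access frequency.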

Obviously, partitioning based only on bucket would be bad if you wrote mostly 
to one bucket. But more subtly, you could write equally to all buckets but 
store the largest or most frequently-accessed values in only one bucket.
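
A sketch of that subtler case, again in Python and again with everything 
made up for illustration: the bucket names, the value sizes, the 8-partition 
count, and the modulo mapping are assumptions, not Riak's implementation. 
Every bucket gets the same number of writes, but one bucket holds much larger 
values, and the stored bytes per partition are compared when hashing the 
bucket alone versus bucket plus key.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 8  # hypothetical partition count, chosen for illustration


def partition_for(data: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map bytes to a partition: SHA-1 digest as an integer, modulo the partition count."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % num_partitions


def bytes_per_partition(writes, key_fn, num_partitions=NUM_PARTITIONS):
    """Sum stored bytes per partition.

    writes is an iterable of (bucket, key, size_in_bytes);
    key_fn(bucket, key) returns the bytes that get hashed.
    """
    load = defaultdict(int)
    for bucket, key, size in writes:
        load[partition_for(key_fn(bucket, key), num_partitions)] += size
    return [load[p] for p in range(num_partitions)]


# Made-up workload: 1000 writes to every bucket, but "videos" stores large values.
SIZES = {"users": 1_000, "sessions": 1_000, "logs": 1_000, "videos": 1_000_000}
writes = [(b, str(i), size) for b, size in SIZES.items() for i in range(1000)]

by_bucket = bytes_per_partition(writes, lambda b, k: b.encode())
by_bucket_and_key = bytes_per_partition(writes, lambda b, k: f"{b}/{k}".encode())

print("bucket only :", by_bucket)
print("bucket + key:", by_bucket_and_key)
```

Hashing the bucket alone puts every "videos" value on a single partition, 
even though the write counts per bucket are identical. Hashing bucket plus 
key spreads those large values across partitions.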

Even Riak's current partitioning scheme could be imbalanced if you only stored 
large values in keys whose SHA-1 has a certain prefix. That's admittedly 
extremely unlikely, which is why Riak chose this scheme. But it could happen.

Anyway, overriding the default partitioning function should always be an 
advanced-only feature: "know your data" first ...

-Nate


