Correct me if I'm wrong here. Even though you can run range queries with Random 
Partitioner, it's a lot less efficient if you have to go across different 
machines to collect the results. If you're doing a lot of range queries, it makes 
sense to have keys stored sequentially, so that if you do need to go to disk, 
the reads are sequential rather than lots of random reads across your system.

It's also my understanding that if you go with the Order Preserving Partitioner 
(OPP), you could hash your keys yourself using MD5 or SHA-1 to effectively get 
random partitioning. So it's a bit of a pain, but not impossible, to split 
between OPP and Random Partitioner (RP) behavior for your different column 
families/keyspaces.
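
Here is a rough sketch of what I mean by hashing the key yourself, in Python. 
The function names and the date-based key format are made up for illustration; 
nothing here is a Cassandra API, it only shows how the row key string would be 
built on the client before it is written.

```python
import hashlib


def hashed_key(raw_key: str) -> str:
    """Hypothetical helper: key for data that should be spread evenly.

    Under OPP, rows are placed by the key's sort order, so using the MD5 hex
    digest as the key scatters rows roughly uniformly, much like RP would.
    """
    return hashlib.md5(raw_key.encode("utf-8")).hexdigest()


def ordered_key(date_str: str, suffix: str) -> str:
    """Hypothetical helper: key for data that should stay range-scannable.

    An ISO-style date prefix (YYYY-MM-DD) sorts chronologically as a string,
    so a key range scan under OPP walks the rows in date order.
    """
    return f"{date_str}:{suffix}"


if __name__ == "__main__":
    # Date-keyed rows keep their natural order.
    print(ordered_key("2010-05-26", "event-1"))
    # Other rows get a hashed key, so adjacent raw keys land far apart.
    print(hashed_key("user-1001"))
    print(hashed_key("user-1002"))
```

The catch is that a hashed key can't be turned back into the original, so you 
would have to store the raw key in a column if you need it later, and range 
scans over the hashed rows are no longer meaningful.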

On May 26, 2010, at 2:32 AM, David Boxenhorn wrote:

> Just in case you don't know: You can do range searches on keys even with 
> Random Partitioner, you just won't get the results in order. If this is good 
> enough for you (e.g. if you can order the results on the client, or if you 
> just need to get the right answer, but not the right order), then you should 
> use Random Partitioner. 
> 
> (I bring this up because it confused me until recently.) 
> 
> On Wed, May 26, 2010 at 5:14 AM, Steve Lihn <stevel...@gmail.com> wrote:
> I have a question on using Order Preserving Partitioner. 
> 
> Many rowKeys in my system will be related to dates, so it seems natural to 
> use Order Preserving Partitioner instead of the default Random Partitioner. 
> However, I have been warned that special attention has to be applied for 
> Order Preserving Partitioner to work properly (basically to ensure a good key 
> distribution and avoid "hot spots") and reverting back to Random may not be 
> easy. Also, not every rowKey is related to dates. For those, using Random 
> Partitioner is okay, but there is only one place to set the Partitioner.
> 
> (Note: The intention of this warning is actually to discredit Cassandra and 
> persuade me not to use it.)
> 
> It seems the choice of Partitioner is defined in the storage-conf.xml and is 
> a global property. My question: why does it have to be a global property? Is 
> there a future plan to make it customizable per KeySpace (just like you would 
> choose hash or range partitioning for different tables/data in an RDBMS)?
> 
> Thanks,
> Steve 
> 
