I don't think that queries on a key range are valid unless you are using OPP. As far as hashing the key for OPP goes, I take it to be the same as not using OPP. It's really a matter of where the hashing gets done, but it has much the same effect. (I think)
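
For what it's worth, here is a rough Python sketch of what "hashing the key yourself" under OPP could look like. The function name `hashed_row_key` and the `digest:key` layout are made up for illustration; they are not anything Cassandra provides. The point is only that once the MD5 digest leads the key, an order-preserving scan walks the keys in digest order rather than date order.

```python
import hashlib


def hashed_row_key(natural_key: str) -> str:
    """Prefix a key with its MD5 hex digest (hypothetical layout: digest:key)."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    return f"{digest}:{natural_key}"


# Date-based row keys, as in Steve's use case.
date_keys = ["2010-05-24", "2010-05-25", "2010-05-26"]

# An order-preserving partitioner would lay rows out by the stored key,
# which now sorts by digest rather than by date.
stored_keys = sorted(hashed_row_key(k) for k in date_keys)

# Strip the digest prefix to see the order a key-range scan would return.
scan_order = [k.split(":", 1)[1] for k in stored_keys]
print(scan_order)
```

The scan still returns every key, just not in date order, so any date ordering would have to be redone on the client.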

Jonathan

On Wed, May 26, 2010 at 12:51 PM, Peter Hsu <pe...@motivecast.com> wrote:
> Correct me if I'm wrong here. Even though you can get your results with
> Random Partitioner, it's a lot less efficient if you're going across
> different machines to get your results. If you're doing a lot of range
> queries, it makes sense to have things ordered sequentially so that if you
> do need to go to disk, the reads will be faster, rather than lots of random
> reads across your system.
>
> It's also my understanding that if you go with the OPP, you could hash your
> key yourself using md5 or sha-1 to effectively get random partitioning. So
> it's a bit of a pain, but not impossible to do a split between OPP and RP
> for your different columnfamily/keyspaces.
>
> On May 26, 2010, at 2:32 AM, David Boxenhorn wrote:
>
> > Just in case you don't know: You can do range searches on keys even with
> Random Partitioner, you just won't get the results in order. If this is good
> enough for you (e.g. if you can order the results on the client, or if you
> just need to get the right answer, but not the right order), then you should
> use Random Partitioner.
> >
> > (I bring this up because it confused me until recently.)
> >
> > On Wed, May 26, 2010 at 5:14 AM, Steve Lihn <stevel...@gmail.com> wrote:
>>
>> I have a question on using Order Preserving Partitioner.
>>
>> Many rowKeys in my system will be related to dates, so it seems natural to
>> use Order Preserving Partitioner instead of the default Random Partitioner.
>> However, I have been warned that special attention has to be applied for
>> Order Preserving Partitioner to work properly (basically to ensure a good
>> key distribution and avoid "hot spots") and reverting it back to Random may
>> not be easy. Also, not every rowKey is related to dates; for these, using
>> Random Partitioner is okay, but there is only one place to set Partitioner.
>>
>> (Note: The intention of this warning is actually to discredit Cassandra
>> and persuade me not to use it.)
>>
>> It seems the choice of Partitioner is defined in the storage-conf.xml and
>> is a global property. My question: why does it have to be a global property?
>> Is there a future plan to make it customizable per KeySpace (just like you
>> would choose hash or range partition for different table/data in RDBMS)?
>>
>> Thanks,
>> Steve