Bryce,

Have you considered using CompositeColumns and a standard CF? The row key is the UUID and the column name is (timestamp : dir_entry); you can then slice all columns with a particular timestamp.

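A minimal sketch of the layout described above, modelled in plain Python rather than against a real Cassandra client. The `Row` class, its method names, and the use of integer timestamps are all assumptions made for illustration; the point is only that when column names sort as (timestamp, dir_entry) pairs, every column for one timestamp is contiguous and can be read with a single slice.

```python
import bisect


class Row:
    """In-memory stand-in for one row whose column names are
    (timestamp, dir_entry) composites kept in sorted order.
    Hypothetical helper, not a Cassandra API."""

    def __init__(self):
        self.names = []   # sorted list of composite column names
        self.values = {}  # composite column name -> column value

    def insert(self, timestamp, dir_entry, value):
        name = (timestamp, dir_entry)
        if name not in self.values:
            # Keep names ordered by timestamp first, then dir_entry.
            bisect.insort(self.names, name)
        self.values[name] = value

    def slice_timestamp(self, timestamp):
        """Return every column whose first component equals timestamp.

        Assumes integer timestamps, so (timestamp + 1,) is a valid
        exclusive upper bound for the slice.
        """
        start = bisect.bisect_left(self.names, (timestamp,))
        end = bisect.bisect_left(self.names, (timestamp + 1,))
        return [(n, self.values[n]) for n in self.names[start:end]]
```

With this model, one row per UUID holds every snapshot, and reading a single snapshot is one contiguous slice instead of a super column fetch.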
Even if you have a random key, I would use the RandomPartitioner (RP) unless you have an extreme use case.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/12/2011, at 3:06 AM, Bryce Allen wrote:

> I think it comes down to how much you benefit from row range scans, and
> how confident you are that going forward all data will continue to use
> random row keys.
>
> I'm considering using BOP as a way of working around the non-indexed
> super column limitation. In my current schema, row keys are random
> UUIDs, super column names are timestamps, and columns contain a
> snapshot in time of directory contents, and could be quite large. If
> instead I use row keys that are (uuid)-(timestamp), and use a standard
> column family, I can do a row range query and select only specific
> columns. I'm still evaluating if I can do this with BOP - ideally the
> token would just use the first 128 bits of the key, and I haven't found
> any documentation on how it compares keys of different length.
>
> Another trick with BOP is to use MD5(rowkey)-rowkey for data that has
> non-uniform row keys. I think it's reasonable to use if most data is
> uniform and benefits from range scans, but a few things are added that
> aren't/don't. This trick does make the keys larger, which increases
> storage cost and IO load, so it's probably a bad idea if a significant
> subset of the data requires it.
>
> Disclaimer - I wrote that wiki article to fill in a documentation gap,
> since there were no examples of BOP and I wasted a lot of time before I
> noticed the hex byte array vs decimal distinction for specifying the
> initial tokens (which to be fair is documented, just easy to miss on a
> skim). I'm also new to Cassandra, I'm just describing what makes sense
> to me "on paper". FWIW I confirmed that random (type 4) UUID row keys
> really do evenly distribute when using BOP.
>
> -Bryce
>
> On Mon, 19 Dec 2011 19:01:00 -0800
> Drew Kutcharian <d...@venarc.com> wrote:
>> Hey Guys,
>>
>> I just came across
>> http://wiki.apache.org/cassandra/ByteOrderedPartitioner and it got me
>> thinking. If the row keys are java.util.UUID values which are generated
>> randomly (and securely), then what type of partitioner would be the
>> best? Since the key values are already random, would it make a
>> difference to use RandomPartitioner, or can one use
>> ByteOrderedPartitioner or OrderPreservingPartitioner as well and get
>> the same result?
>>
>> -- Drew
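
The MD5(rowkey)-rowkey trick mentioned in the thread can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes the "-" in the description is notation for concatenation, so the 16-byte digest is simply prepended with no separator byte. It does not touch Cassandra itself.

```python
import hashlib

MD5_LEN = 16  # an MD5 digest is always 16 bytes


def bop_key(row_key: bytes) -> bytes:
    """Prepend the MD5 digest of row_key so that byte-wise ordering of
    the stored key is driven by the hash rather than the original key.
    Hypothetical helper for illustration."""
    return hashlib.md5(row_key).digest() + row_key


def original_key(stored_key: bytes) -> bytes:
    """Recover the original row key by dropping the fixed-size prefix."""
    return stored_key[MD5_LEN:]


if __name__ == "__main__":
    for key in (b"user:0001", b"user:0002", b"user:0003"):
        stored = bop_key(key)
        # Each stored key is 16 bytes longer than the original.
        print(stored.hex(), len(stored) - len(key))
```

Every stored key grows by 16 bytes, which is the storage and IO cost noted in the thread, and range scans over the prefixed keys no longer follow the order of the original keys.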