Aaron, thank you for the link. What is discussed there is not exactly what I am thinking of. They propose distributing the keys as <MD5(ROWKEY)>.<ROWKEY>, which will distribute the values in a way that cannot easily be reversed. What I am proposing is to distribute the keys evenly among N buckets, where N is much larger than the number of nodes, and then construct each range query as the union of N range queries that I actually perform on Cassandra.
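Roughly, the bucketing idea could be sketched like this in Python. This is only an illustration under some assumptions that are not part of the proposal itself: CRC32 stands in for whatever stable hash is used, the bucket number is zero-padded so that prefixes sort lexically under OPP, and fetch_range is a hypothetical stand-in for the client's range call, not a real Cassandra API.

```python
import heapq
import zlib

NUM_BUCKETS = 97  # N from the example; assumed to be much larger than the node count


def bucket_of(key: str) -> int:
    """Stable bucket number for a key (CRC32 is an assumption; any stable hash works)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def cass_key(key: str) -> str:
    """Key as stored: '<bucket>@<key>', with the bucket zero-padded to sort lexically."""
    return f"{bucket_of(key):02d}@{key}"


def cass_ranges(start: str, end: str):
    """The N (start, end) pairs whose union covers the logical range [start, end]."""
    return [(f"{b:02d}@{start}", f"{b:02d}@{end}") for b in range(NUM_BUCKETS)]


def logical_key(stored: str) -> str:
    """Strip the bucket prefix from a stored key."""
    return stored.split("@", 1)[1]


def range_query(fetch_range, start: str, end: str):
    """Run one range query per bucket and merge the results into logical key order.

    fetch_range(lo, hi) is a hypothetical callable; it must return the stored
    keys k with lo <= k <= hi in sorted order, as an ordered range scan would.
    """
    per_bucket = (
        [logical_key(k) for k in fetch_range(lo, hi)]
        for lo, hi in cass_ranges(start, end)
    )
    # Each bucket's result is already sorted, so an N-way merge is enough.
    return list(heapq.merge(*per_bucket))
```

The cost is N small range queries per logical range query, so N trades evenness of distribution against query fan-out.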
"You can do range queries with the Random Partitioner in 0.6.*"

I went through this before; it's not true. What you can do is loop over your entire set of keys in random order. There is no way to get an actual range other than the whole range.

On Wed, Jul 7, 2010 at 1:15 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> That pattern is discussed here
> http://ria101.wordpress.com/2010/02/22/cassandra-randompartitioner-vs-orderpreservingpartitioner/
>
> It's also used in http://github.com/tjake/Lucandra
>
> You can do range queries with the Random Partitioner in 0.6.*, the order of
> the return is undefined and it's a bit slower.
>
> I think it's normally used when you want ordered range queries in some CF's
> and random distribution in others.
>
> Aaron
>
>
> On 07 Jul, 2010, at 09:47 PM, David Boxenhorn <da...@lookin2.com> wrote:
>
> Is there any strategy for using OPP with a hash algorithm on the client
> side to get both uniform distribution of data in the cluster *and* the
> ability to do range queries?
>
> I'm thinking of something like this:
>
> cassKey = (key % 97) + "@" + key;
>
> cassRange = 0 + "@" + range; 1 + "@" + range; ... 96 + "@" + range;
>
> Would something like that work?