On 7 April 2010 19:13, Jonathan Ellis <jbel...@gmail.com> wrote:
> One thing you can do is manually "randomize" keys for any CFs that
> don't need the OP by pre-pending their md5 to the key you send
> Cassandra. (This is all RP is doing under the hood anyway.)
>

Another possibility is to prepend a hash of something you don't need to range scan on to the beginning of the keys. For example, if you have thousands of customers, each of whom only wants to range scan within their own data, you can hash the customer ID and put that at the beginning (I use a 16-bit hash written as four hex digits; it gives enough distribution with sane numbers of nodes). Then you'll tend to get keys which start with 0000 - ffff, followed by whatever your increasing key is (timestamp etc).

Workloads should tend to balance out, but will get a bit patchy if you have, for example, a small number of disproportionately huge customers.

Mark
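
For illustration, here is a rough Python sketch of the key scheme described above. It is not the code I actually run: the function names, the ":" separator, the use of md5 truncated to 16 bits, and the zero-padded timestamp are all assumptions made for the example. Including the customer ID after the prefix is also an assumption, added so that two customers whose hashes collide don't interleave their rows.

```python
import hashlib


def customer_prefix(customer_id: str) -> str:
    """Return a 16-bit hash of the customer ID as four lowercase hex digits.

    md5 is an arbitrary choice here; any well-distributed hash would do.
    """
    digest = hashlib.md5(customer_id.encode("utf-8")).digest()
    return digest[:2].hex()  # first two bytes -> "0000" .. "ffff"


def make_row_key(customer_id: str, timestamp: int) -> str:
    """Build a key of the form <prefix>:<customer_id>:<zero-padded timestamp>.

    Zero-padding keeps the lexical order of keys equal to the numeric
    order of timestamps, which is what a range scan under an
    order-preserving partitioner relies on.
    """
    return f"{customer_prefix(customer_id)}:{customer_id}:{timestamp:020d}"


def key_range(customer_id: str, start_ts: int, end_ts: int) -> tuple[str, str]:
    """Return (start_key, end_key) bounding one customer's time window."""
    return make_row_key(customer_id, start_ts), make_row_key(customer_id, end_ts)


if __name__ == "__main__":
    # Hypothetical customer IDs, purely for demonstration.
    for cid in ("acme", "globex", "initech"):
        print(make_row_key(cid, 1270667580))
```

All of one customer's keys share the same four-digit prefix, so they stay contiguous and a range scan between the two keys from key_range() only touches that customer's rows, while different customers are spread across the 0000 - ffff space.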