I want to do ranged row queries for a few of my column families, but best practice seems to be to use the random partitioner. Splitting my column families between two clusters (one random, one ordered) seems like a pretty expensive compromise.

Instead, I'm thinking of using the order-preserving partitioner for the whole cluster and distributing load for most of my column families by hashing the row keys in my application code. Then, for the few column families where I need to slice over ranges of rows, I can just use unhashed keys.
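To make the idea concrete, here is a rough sketch of the key handling I have in mind. The function names and the "events_by_time" column family are made up for illustration, and MD5 is just an assumed choice of hash; none of this is a Cassandra API, only application-side code that runs before a key is handed to the client.

```python
import hashlib

# Hypothetical set of column families that keep their natural key order
# so that row range slices stay meaningful.
ORDERED_CFS = frozenset({"events_by_time"})


def hashed_row_key(raw_key: str) -> str:
    """Return the hex MD5 digest of raw_key, to be stored as the row key."""
    return hashlib.md5(raw_key.encode("utf-8")).hexdigest()


def row_key_for(column_family: str, raw_key: str) -> str:
    """Pick the key actually written to the cluster.

    Column families listed in ORDERED_CFS keep the raw key; every other
    column family gets a hashed key to spread rows around the ring.
    """
    if column_family in ORDERED_CFS:
        return raw_key
    return hashed_row_key(raw_key)


if __name__ == "__main__":
    print(row_key_for("users", "todd"))                 # hashed key
    print(row_key_for("events_by_time", "2010-01-01"))  # raw key, order kept
```

With this scheme, a range scan over a hashed column family would come back in hash order, so range slices would only be useful on the unhashed ones.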

What would be the effective difference between hashing the keys myself and letting the random partitioner do it? Is this advisable?

Thanks,
Todd
