RE: OrderPreservingPartitioner limits and workarounds

2010-04-29 Thread Mark Jones
To: cassandra user
Subject: OrderPreservingPartitioner limits and workarounds

I have one append-oriented workload and I would like to know if Cassandra is appropriate for it. Given:
* 100 nodes
* an OrderPreservingPartitioner
* a replication factor of "3"
* a write-pattern

Re: OrderPreservingPartitioner limits and workarounds

2010-04-08 Thread Mark Robson
On 7 April 2010 19:13, Jonathan Ellis wrote:
> One thing you can do is manually "randomize" keys for any CFs that
> don't need the OP by pre-pending their md5 to the key you send
> Cassandra. (This is all RP is doing under the hood anyway.)
>

Another possibility is to prepend some hash of somet

Re: OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Jonathan Ellis
One thing you can do is manually "randomize" keys for any CFs that don't need the OP by pre-pending their md5 to the key you send Cassandra. (This is all RP is doing under the hood anyway.)

On Wed, Apr 7, 2010 at 5:51 AM, Paul Prescod wrote:
> I have one append-oriented workload and I would like
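A minimal sketch of the key-randomizing suggestion above, in Python. The function names `randomize_key` and `original_key` and the `:` separator are hypothetical choices for illustration and do not come from the thread. The sketch only builds the key string; it does not talk to Cassandra.

```python
import hashlib


def randomize_key(key: str) -> str:
    """Prefix a key with the hex MD5 digest of the key itself.

    Under an order-preserving partitioner, keys built this way sort by
    their digest, so rows are spread across the ring instead of
    clustering by their natural order. Range scans over the natural
    key order are no longer meaningful for these rows.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # The ":" separator is an assumption made for readability.
    return f"{digest}:{key}"


def original_key(stored_key: str) -> str:
    """Recover the natural key from a randomized key.

    An MD5 hex digest is always 32 characters, so the natural key
    starts after the digest and the one-character separator.
    """
    return stored_key[33:]
```

Because the prefix is computed from the key itself, a reader that knows the natural key can rebuild the stored key with `randomize_key` and fetch the row directly.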

Re: OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Paul Prescod
Since I wrote that at 3:51AM (my time), I came to many of the same conclusions and decided to write them up to try and provide a high-level guide on sorting and ordering.

* http://jottit.com/s8c4a/

But for completeness I was still hoping to document any workarounds that would help mitigate load b

Re: OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Benjamin Black
I'd suggest you use RandomPartitioner, an index, and multiget. You'll be able to do range queries and won't have the load imbalance and performance problems of OPP and native range queries.

b

On Wed, Apr 7, 2010 at 3:51 AM, Paul Prescod wrote:
> I have one append-oriented workload and I would
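One way to read the "index plus multiget" suggestion above is sketched below in Python. The message does not spell out the layout, so this is an assumption: a sorted list of keys stands in for an index kept in sorted order, and a plain dict stands in for randomly partitioned rows. `ToyStore` and its methods are hypothetical and are not a Cassandra client API.

```python
import bisect


class ToyStore:
    """In-memory stand-in for a randomly partitioned store plus an index."""

    def __init__(self):
        self.rows = {}    # key -> value; placement is assumed to be random
        self.index = []   # keys kept in sorted order (the assumed "index")

    def insert(self, key, value):
        # Record the key in the sorted index the first time it is seen.
        if key not in self.rows:
            bisect.insort(self.index, key)
        self.rows[key] = value

    def multiget(self, keys):
        # Fetch several rows by exact key in one call.
        return {k: self.rows[k] for k in keys if k in self.rows}

    def range_query(self, start, end):
        # Slice the sorted index for keys in [start, end], then multiget.
        lo = bisect.bisect_left(self.index, start)
        hi = bisect.bisect_right(self.index, end)
        return self.multiget(self.index[lo:hi])
```

The range query never scans the rows themselves: it reads a slice of the index and then looks up each row by exact key, which is why row placement does not need to preserve order.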

OrderPreservingPartitioner limits and workarounds

2010-04-07 Thread Paul Prescod
I have one append-oriented workload and I would like to know if Cassandra is appropriate for it. Given:
* 100 nodes
* an OrderPreservingPartitioner
* a replication factor of "3"
* a write-pattern of "always append"
* a strong requirement for range queries

My understanding is that there