Thanks for the tips on the replication factor. Any thoughts on the number of nodes a cluster needs to support RF=3 with a workload of 400 ops/sec (4-8K rows, 50/50 read/write)? Based on the "sweet spot" hardware referenced in the wiki (8-core, 16-32GB RAM), what kind of ops/sec could I reasonably expect from each node? Just looking for a range to make some educated guesses.

Thanks,
Brian

On Wed, Mar 23, 2011 at 9:04 PM, aaron morton <aa...@thelastpickle.com> wrote:

> It really does depend on what your workload is like, and in the end will
> involve a certain amount of fudge factor.
>
> http://wiki.apache.org/cassandra/CassandraHardware provides some guidance.
> http://wiki.apache.org/cassandra/MemtableThresholds can be used to get a
> rough idea of the memory requirements. Note that secondary indexes are also
> CFs with the same memory settings as the parent.
>
> With RF 3 you can afford to lose one replica for a key's token range and
> still be available (assuming Quorum CL). With RF 5 you can lose 2 replicas
> and still be available for the keys in the range.
>
> I've been careful to say "lose X replicas" because the other nodes in the
> cluster don't count when considering an operation for a key. Two examples,
> for a 9 node cluster with RF 3: if you lose nodes 2 and 3 and they are
> replicas for node 1, Quorum operations on keys in the range for node 1 will
> fail (ranges for 2 and 3 will be ok). If you lose nodes 2 and 5, Quorum
> operations will succeed for all keys.
>
> RF 3 is a reasonable starting point for some redundancy, RF 5 is more. After
> that it's Web Scale (tm).
>
> Hope that helps
> Aaron
>
> On 24 Mar 2011, at 04:04, Brian Fitzpatrick wrote:
>
> I'm going through the process of speccing out the hardware for a
> Cassandra cluster. The relevant specs:
>
> - Support 460 operations/sec (50/50 read/write workload). Row size
>   ranges from 4 to 8K.
> - Support 29 million objects for the first year
> - Support 365 GB storage for the first year, based on Cassandra tests
>   (data + index + overhead * replication factor of 3)
>
> I'm looking for advice on the node size for this cluster, recommended
> RAM per node, and whether RF=3 seems to be a good choice for general
> availability and resistance to failure.
>
> I've looked at the YCSB benchmark paper and through the archives of
> this email list looking for pointers. I haven't found any general
> guidelines on recommended cluster size to support X operations/sec
> with Y data size at an RF of Z that I could extrapolate from.
>
> Any and all recommendations appreciated.
>
> Thanks,
> Brian
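
The "lose X replicas" reasoning in Aaron's reply can be checked with a rough sketch. It assumes a simple ring placement in which the range owned by node N is replicated on N and the next RF-1 nodes clockwise; that placement, and the function names, are assumptions for illustration and not something stated in the thread. Real placement depends on the replication strategy and token assignment in use.

```python
def quorum(rf):
    """Replicas that must respond for a QUORUM operation."""
    return rf // 2 + 1


def replicas_for(node, num_nodes, rf):
    """Nodes holding the range owned by `node` (1-based).

    Assumption: replicas are the owner plus the next rf-1 nodes clockwise.
    """
    return [((node - 1 + i) % num_nodes) + 1 for i in range(rf)]


def unavailable_ranges(num_nodes, rf, down):
    """Owners of ranges that can no longer reach QUORUM with `down` nodes lost."""
    down = set(down)
    failed = []
    for node in range(1, num_nodes + 1):
        live = [r for r in replicas_for(node, num_nodes, rf) if r not in down]
        if len(live) < quorum(rf):
            failed.append(node)
    return failed


if __name__ == "__main__":
    # 9 node cluster, RF 3, losing two adjacent nodes vs. two spread-out nodes.
    print(unavailable_ranges(9, 3, {2, 3}))
    print(unavailable_ranges(9, 3, {2, 5}))
    # RF 5 tolerates two lost replicas per range.
    print(unavailable_ranges(9, 5, {1, 2}))
```

Losing nodes 2 and 5 leaves every range with at least two live replicas, so Quorum still succeeds everywhere. Under the assumed placement, losing nodes 2 and 3 takes out the range for node 1 and also the range for node 2, since 2 and 3 are two of that range's three replicas; which ranges are hit in a real cluster depends on where the replicas actually sit.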