It really does depend on what your workload is like, and in the end will 
involve a certain amount of fudge factor. 
 
http://wiki.apache.org/cassandra/CassandraHardware provides some guidance. 

http://wiki.apache.org/cassandra/MemtableThresholds can be used to get a rough 
idea of the memory requirements. Note that secondary indexes are also CFs, with 
the same memory settings as the parent CF. 
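
To make that concrete, here's a rough back-of-envelope sketch in Python. The 
numbers are hypothetical placeholders, not defaults; plug in your own CF count 
and memtable thresholds:

    # Back-of-envelope memtable heap estimate (hypothetical numbers).
    # Each CF keeps its own memtable, and each secondary index is
    # effectively another CF with the same thresholds as its parent.
    memtable_throughput_mb = 64   # hypothetical per-CF memtable threshold
    column_families = 4           # hypothetical CF count
    secondary_indexes = 2         # each one counts as an extra CF

    total_cfs = column_families + secondary_indexes
    # Worst case: every memtable full just before flushing, times a
    # fudge factor for in-flight flushes and general JVM overhead.
    fudge = 2.0
    estimated_heap_mb = total_cfs * memtable_throughput_mb * fudge
    print("~%.0f MB of heap for memtables" % estimated_heap_mb)  # ~768 MB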

With RF 3 you can afford to lose one replica for a token range and still be 
available (assuming Quorum CL). With RF 5 you can lose two replicas and still 
be available for the keys in that range. 
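
The arithmetic behind that is just the quorum formula, floor(RF / 2) + 1. A 
minimal Python sketch (not Cassandra code):

    # Quorum size and tolerable replica failures for a given RF.
    def quorum(rf):
        return rf // 2 + 1

    for rf in (3, 5):
        print("RF=%d: quorum=%d, can lose %d replica(s)"
              % (rf, quorum(rf), rf - quorum(rf)))
    # RF=3: quorum=2, can lose 1 replica(s)
    # RF=5: quorum=3, can lose 2 replica(s)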

I've been careful to say "lose X replicas" because the other nodes in the 
cluster don't count when considering an operation for a key. Two examples, 
using a 9-node cluster with RF 3: if you lose nodes 2 and 3, which are both 
replicas for node 1's range, Quorum operations on keys in that range will fail 
(ranges where only one replica is down will be ok). If you lose nodes 2 and 5, 
Quorum operations will succeed for all keys. 
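
You can model that with a toy SimpleStrategy-style ring, where the range owned 
by node i is replicated on node i and the next RF - 1 nodes clockwise (the 
exact replica sets depend on your strategy and tokens). A hedged Python sketch 
of the two examples above:

    # Toy model of replica placement on a 9-node ring with RF 3.
    NODES = list(range(1, 10))
    RF = 3
    QUORUM = RF // 2 + 1  # 2

    def replicas(i):
        # Replicas for node i's range: node i and the next RF-1 nodes.
        return [NODES[(i + j) % len(NODES)] for j in range(RF)]

    def ranges_at_quorum(down):
        # For each range, is a quorum of its replicas still up?
        return {NODES[i]: sum(r not in down for r in replicas(i)) >= QUORUM
                for i in range(len(NODES))}

    # Ranges owned by nodes 1 and 2 lose quorum: both have 2 and 3 as replicas.
    print(ranges_at_quorum({2, 3}))
    # Every range keeps at least 2 of 3 replicas, so all keys stay available.
    print(ranges_at_quorum({2, 5}))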

RF 3 is a reasonable starting point for some redundancy; RF 5 gives you more. 
After that it's Web Scale (tm).

Hope that helps
Aaron
 
On 24 Mar 2011, at 04:04, Brian Fitzpatrick wrote:

> I'm going through the process of specing out the hardware for a
> Cassandra cluster. The relevant specs:
> 
> - Support 460 operations/sec (50/50 read/write workload). Row size
> ranges from 4 to 8K.
> - Support 29 million objects for the first year
> - Support 365 GB storage for the first year, based on Cassandra tests
> (data + index + overhead * replication factor of 3)
> 
> I'm looking for advice on the node size for this cluster, recommended
> RAM per node, and whether RF=3 seems to be a good choice for general
> availability and resistance to failure.
> 
> I've looked at the YCSB benchmark paper and through the archives of
> this email list looking for pointers.  I haven't found any general
> guidelines on recommended cluster size to support X operations/sec
> with Y data size at RF factor of Z, that I could extrapolate from.
> 
> Any and all recommendations appreciated.
> 
> Thanks,
> Brian
