Aaron, how did you get to 1,280 writes/sec? Counting 64 writers each taking 5ms per write cycle, and assuming truly parallel access with no slowdowns, I get 12,800 writes/sec. Am I missing something?
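Here's the back-of-the-envelope arithmetic I'm using as a sanity check (a toy sketch, assuming perfectly parallel writers, a flat 5ms per write, and no contention; the variable names are just for illustration):

    # Max throughput from concurrency and latency: workers / latency.
    writers = 64             # 8 cores * 8 writers per core
    write_latency_s = 0.005  # 5ms per write request

    per_writer = 1 / write_latency_s        # 200 writes/sec per writer
    max_writes = writers * per_writer       # 64 * 200 = 12,800 writes/sec
    print(f"{max_writes:,.0f} writes/sec")  # prints 12,800 -- not 1,280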
From Jose's iPhone

On Mar 24, 2011, at 2:52 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Big old guess of something in the 1000's.
>
> Try benchmarking your workload and plug the numbers in (my 5ms is
> pretty high)...
>
> - 8 cores * 8 writers per core = 64 writers; if each write request
> takes 5ms = 1280 max per sec
> - 1 spindle * 16 readers per spindle = 16 readers; if each read
> request takes 5ms = 320 max per sec
> (reader and writer sizes from the help in conf/cassandra.yaml)
>
> This is really just a guess; there are a lot more things going on in
> the system, and it gets even more complicated once it's turned on. But
> I know sometimes you just need to show you've thought about it :)
>
> Hope that helps.
> Aaron
>
> On 25 Mar 2011, at 02:27, Brian Fitzpatrick wrote:
>
>> Thanks for the tips on the replication factor. Any thoughts on the
>> number of nodes in a cluster to support an RF=3 with a workload of 400
>> ops/sec (4-8K sized rows, 50/50 read/write)? Based on the "sweet
>> spot" hardware referenced in the wiki (8-core, 16-32GB RAM), what kind
>> of ops/sec could I reasonably expect from each node? Just looking for
>> a range to make some educated guesses.
>>
>> Thanks,
>> Brian
>>
>> On Wed, Mar 23, 2011 at 9:04 PM, aaron morton <aa...@thelastpickle.com>
>> wrote:
>>> It really does depend on what your workload is like, and in the end
>>> it will involve a certain amount of fudge factor.
>>>
>>> http://wiki.apache.org/cassandra/CassandraHardware provides some
>>> guidance. http://wiki.apache.org/cassandra/MemtableThresholds can be
>>> used to get a rough idea of the memory requirements. Note that
>>> secondary indexes are also CFs, with the same memory settings as the
>>> parent.
>>>
>>> With RF 3 you can afford to lose one replica for a token range and
>>> still be available (assuming Quorum CL). With RF 5 you can lose 2
>>> replicas and still be available for the keys in the range.
>>>
>>> I've been careful to say "lose X replicas" because the other nodes
>>> in the cluster don't count when considering an operation for a key.
>>> Two examples with a 9-node cluster and RF 3: if you lose nodes 2 and
>>> 3 and they are replicas for node 1, Quorum operations on keys in the
>>> range for node 1 will fail (ranges for 2 and 3 will be ok). If you
>>> lose nodes 2 and 5, Quorum operations will succeed for all keys.
>>>
>>> RF 3 is a reasonable starting point for some redundancy; RF 5 is
>>> more. After that it's Web Scale (tm).
>>>
>>> Hope that helps
>>> Aaron
>>>
>>> On 24 Mar 2011, at 04:04, Brian Fitzpatrick wrote:
>>>
>>> I'm going through the process of speccing out the hardware for a
>>> Cassandra cluster. The relevant specs:
>>>
>>> - Support 460 operations/sec (50/50 read/write workload). Row size
>>> ranges from 4 to 8K.
>>> - Support 29 million objects for the first year
>>> - Support 365 GB storage for the first year, based on Cassandra tests
>>> (data + index + overhead * replication factor of 3)
>>>
>>> I'm looking for advice on the node size for this cluster, recommended
>>> RAM per node, and whether RF=3 seems to be a good choice for general
>>> availability and resistance to failure.
>>>
>>> I've looked at the YCSB benchmark paper and through the archives of
>>> this mailing list looking for pointers. I haven't found any general
>>> guidelines on recommended cluster size to support X operations/sec
>>> with Y data size at RF factor of Z that I could extrapolate from.
>>>
>>> Any and all recommendations appreciated.
>>>
>>> Thanks,
>>> Brian
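P.S. Here's a tiny toy model of Aaron's quorum example, for anyone who wants to play with it. It assumes SimpleStrategy-style placement (the range owned by node i is replicated on i and the next RF-1 nodes clockwise) and that QUORUM needs floor(RF/2) + 1 live replicas; the node numbering and helper names are just made up for illustration:

    # Toy availability check for a 9-node ring with RF=3
    # (assumed SimpleStrategy-style placement; illustrative only).
    N, RF = 9, 3
    QUORUM = RF // 2 + 1  # 2 live replicas needed out of 3

    def replicas(owner):
        """Nodes holding the range owned by `owner` (nodes numbered 1..N)."""
        return [(owner - 1 + k) % N + 1 for k in range(RF)]

    def quorum_failed_ranges(down):
        """Ranges that cannot serve QUORUM ops once the `down` nodes are lost."""
        return [i for i in range(1, N + 1)
                if sum(r not in down for r in replicas(i)) < QUORUM]

    print(quorum_failed_ranges({2, 3}))  # adjacent losses -> some ranges lose quorum
    print(quorum_failed_ranges({2, 5}))  # non-adjacent -> [] (all ranges available)

Under that placement, losing two adjacent nodes knocks out quorum for every range whose three replicas include both of them, while losing two nodes far enough apart on the ring (like 2 and 5) leaves every range with at least two live replicas.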