Re: Sizing a Cassandra cluster

aaron morton Thu, 24 Mar 2011 14:14:42 -0700

Oops, you're right, I was off by a factor of 10.

It should be 12,800 writes/sec and 3,200 reads/sec.


I'll also take the opportunity to say again that these are just "some numbers" 
that may help in understanding how your app will behave when moving to new HW, 
and that there are a lot of other things the nodes have to do (like compactions, 
handling connections, repairs, GC) that take up resources as well.
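
For anyone who wants to plug in their own benchmark numbers, the arithmetic behind the corrected figures can be sketched in Python as below. The function name is made up for illustration, and the model assumes every thread is always busy and every request takes exactly the stated latency, so the result is a ceiling and not a prediction.

```python
def max_ops_per_sec(threads, latency_ms):
    """Upper bound on requests/sec when `threads` workers each
    complete one request every `latency_ms` milliseconds."""
    return threads * (1000.0 / latency_ms)

# Figures taken from the thread; replace them with your own measurements.
cores = 8
writers_per_core = 8       # writer threads per core
spindles = 1
readers_per_spindle = 16   # reader threads per spindle
latency_ms = 5.0           # assumed time per request (described as pretty high)

writers = cores * writers_per_core        # 64 concurrent writers
readers = spindles * readers_per_spindle  # 16 concurrent readers

max_writes = max_ops_per_sec(writers, latency_ms)
max_reads = max_ops_per_sec(readers, latency_ms)

print(f"max writes/sec: {max_writes:,.0f}")  # 12,800
print(f"max reads/sec:  {max_reads:,.0f}")   # 3,200
```

Halving the measured latency doubles the ceiling, which is why benchmarking the real workload matters more than the thread counts.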

Thanks
Aaron

On 25 Mar 2011, at 10:04, Jose Juarez-Comboni wrote:

> 
> Aaron,
> 
> How did you get to 1280 writes/sec? Counting 64 writers each taking 5ms for a 
> write cycle, assuming real parallel access with no speed hits, I get 12,800 
> writes/sec. Am I missing something?
> 
> From Jose's iPhone
> 
> On Mar 24, 2011, at 2:52 PM, aaron morton <aa...@thelastpickle.com> wrote:
> 
>> Big old guess of something in the 1000's. 
>> 
>> Try benchmarking your workload and plugging your own numbers in (my 5ms is 
>> pretty high):
>> 
>> - 8 cores * 8 writers per core = 64 if each write request takes 5ms  = 1280 
>> max per sec
>> - 1 spindle * 16 readers per spindle = 16 readers if each read request takes 
>> 5ms =  320 max per sec
>> (reader and writer sizes from the help in conf/cassandra.yaml)
>> 
>> This is really just a guess, there are a lot more things going on in the 
>> system and it gets even more complicated once it's turned on. But I know 
>> sometimes you just need to show you've thought about it :)
>> 
>> Hope that helps.
>> Aaron
>> 
>> On 25 Mar 2011, at 02:27, Brian Fitzpatrick wrote:
>> 
>>> Thanks for the tips on the replication factor.  Any thoughts on the
>>> number of nodes in a cluster to support an RF=3 with a workload of 400
>>> ops/sec (4-8K sized rows, 50/50 read/write)?  Based on the "sweet
>>> spot" hardware referenced in the wiki (8-core, 16-32GB RAM), what kind
>>> of ops/sec could I reasonably expect from each node?  Just looking for
>>> a range to make some educated guesses.
>>> 
>>> Thanks,
>>> Brian
>>> 
>>> On Wed, Mar 23, 2011 at 9:04 PM, aaron morton <aa...@thelastpickle.com> 
>>> wrote:
>>>> It really does depend on what your workload is like, and in the end will
>>>> involve a certain amount of fudge factor.
>>>> 
>>>> http://wiki.apache.org/cassandra/CassandraHardware provides some guidance.
>>>> http://wiki.apache.org/cassandra/MemtableThresholds can be used to get a
>>>> rough idea of the memory requirements. Note that secondary indexes are also
>>>> CF's with the same memory settings as the parent.
>>>> With RF 3 you can afford to lose one replica for a key or token range and
>>>> still be available (assuming Quorum CL). With RF 5 you can lose 2 replicas
>>>> and still be available for the keys in the range.
>>>> I've been careful to say "lose X replicas" because the other nodes in the
>>>> cluster don't count when considering an operation for a key. Two examples
>>>> for a 9 node cluster with RF 3: if you lose nodes 2 and 3 and they are
>>>> replicas for node 1, Quorum operations on keys in the range for node 1 will
>>>> fail (ranges for 2 and 3 will be ok). If you lose nodes 2 and 5, Quorum
>>>> operations will succeed for all keys.
>>>> RF 3 is a reasonable starting point for some redundancy, RF 5 is more. After
>>>> that it's Web Scale (tm).
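
The "lose X replicas" reasoning can be checked with a small Python sketch. It assumes a simple ring placement in which each range is stored on its owning node plus the next RF-1 nodes clockwise; that placement and the function names are illustrative assumptions, not something stated in the thread. Under this assumed placement the range owned by node 2 (replicas 2, 3 and 4) also drops below quorum when nodes 2 and 3 are down, so the outcome depends on how replicas are actually placed.

```python
def replicas_for(owner, nodes, rf):
    """Assumed placement: the owning node plus the next rf-1 nodes
    clockwise on a ring of nodes numbered 1..nodes."""
    return [((owner - 1 + i) % nodes) + 1 for i in range(rf)]

def failed_ranges(down, nodes=9, rf=3):
    """Owners of the ranges that cannot reach a quorum of live replicas."""
    quorum = rf // 2 + 1  # 2 of 3, 3 of 5
    failed = []
    for owner in range(1, nodes + 1):
        alive = [n for n in replicas_for(owner, nodes, rf) if n not in down]
        if len(alive) < quorum:
            failed.append(owner)
    return failed

# Two adjacent nodes down: some ranges lose two of their three replicas.
print(failed_ranges({2, 3}))
# Two nodes down that never share a range: every range keeps a quorum.
print(failed_ranges({2, 5}))
```

What matters is how many replicas of a given range are down, not how many nodes in the cluster are down overall.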
>>>> Hope that helps
>>>> Aaron
>>>> 
>>>> On 24 Mar 2011, at 04:04, Brian Fitzpatrick wrote:
>>>> 
>>>> I'm going through the process of specing out the hardware for a
>>>> Cassandra cluster. The relevant specs:
>>>> 
>>>> - Support 460 operations/sec (50/50 read/write workload). Row size
>>>> ranges from 4 to 8K.
>>>> - Support 29 million objects for the first year
>>>> - Support 365 GB storage for the first year, based on Cassandra tests
>>>> (data + index + overhead * replication factor of 3)
>>>> 
>>>> I'm looking for advice on the node size for this cluster, recommended
>>>> RAM per node, and whether RF=3 seems to be a good choice for general
>>>> availability and resistance to failure.
>>>> 
>>>> I've looked at the YCSB benchmark paper and through the archives of
>>>> this email list looking for pointers.  I haven't found any general
>>>> guidelines on recommended cluster size to support X operations/sec
>>>> with Y data size at RF factor of Z, that I could extrapolate from.
>>>> 
>>>> Any and all recommendations appreciated.
>>>> 
>>>> Thanks,
>>>> Brian
>>>> 
>>>> 
>>
