Every node should have at least 4 cores, with a maximum of 8. Memory shouldn't 
be higher than 32 GB; 16 GB is good for a start. Every node should be a physical 
machine, not a virtual one, or at least a virtual machine with an SSD disk 
subsystem. The disk subsystem should be directly attached to the machine, with 
no SAN or Fibre Channel in between. Cassandra is CPU and I/O bound, so you 
should get the maximum I/O speed and a reasonable number of cores.

The number of nodes should be at least 3, with a replication factor of 2. You 
should prefer more, less powerful nodes to fewer, more powerful nodes.

Disk size depends on your workload, although you should always keep 50% of the 
disk free in case repair sessions require space, or perform sub range repairs.
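
As a rough illustration of the 50% rule, here is a minimal sketch of the disk 
arithmetic for the workload in the original question (1000 writes per second, 
1 KB records). The retention period, the even data distribution, and the 
function name are assumptions made for the example; compression and compaction 
overhead are ignored.

```python
def disk_per_node_gb(writes_per_sec, record_bytes, retention_days,
                     replication_factor, nodes, max_fill=0.5):
    """Raw disk (GB) each node needs so that data fills at most max_fill of it."""
    seconds = retention_days * 86_400
    raw_bytes = writes_per_sec * record_bytes * seconds  # one copy of the data
    replicated = raw_bytes * replication_factor          # all copies, cluster-wide
    per_node = replicated / nodes                        # assumes even distribution
    return per_node / max_fill / 1e9                     # add headroom, bytes -> GB


# 1 KB taken as 1000 bytes; 30 days of retention is an assumed figure.
print(round(disk_per_node_gb(1000, 1000, 30, replication_factor=2, nodes=3)))
```

With these assumed inputs the result is roughly 3.5 TB of raw disk per node, 
half of which stays free.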

In my experience a 1 Gbit link between nodes is OK, but the lower the latency the better.
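
A back-of-the-envelope check of the link, as a sketch only: it assumes the link 
is 1 gigabit per second, counts just the record payload multiplied by the 
replication factor, and pessimistically puts all of that traffic on a single 
link. Reads, repair, streaming and protocol overhead are not modelled, and the 
function name is hypothetical.

```python
def link_utilization(ops_per_sec, record_bytes, replication_factor,
                     link_bits_per_sec=1e9):
    """Fraction of one link consumed by replicated write payload."""
    bits_per_sec = ops_per_sec * record_bytes * 8 * replication_factor
    return bits_per_sec / link_bits_per_sec


# 1000 writes/s of 1000-byte records with replication factor 2
print(f"{link_utilization(1000, 1000, 2):.1%}")
```

Under these assumptions the write payload uses under 2% of the link, which is 
consistent with bandwidth mattering less than latency at this load.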

Summing up: if you need to save some money, get 4 cores and 16 GB of RAM; 32 GB 
is rarely needed and 64 GB is a waste. 8 cores would probably be too much for 
1000 writes a second.


Paolo Crosato
Software engineer/Custom Solutions

From: Chris Lohfink <clohf...@blackbirdit.com>
Sent: Tuesday, September 9, 2014 21:26
To: user@cassandra.apache.org
Subject: Re: hardware sizing for cassandra

It depends.  Ultimately your load is low enough that a single node can probably 
handle it, so you kinda want a "minimum" cluster.  Different people have 
different thoughts on what this means; I would recommend 5-6 nodes with a 
replication factor of 3 (say m1.xlarge, or c3.2xlarge with striped ephemerals; 
I like i2's but they are kinda overkill here).  Nodes with less than 16 GB of 
RAM won't last long, so you should really start around there.
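
One way to compare a replication factor of 3 with a replication factor of 2 is 
to count how many replicas can be down while QUORUM operations still succeed. 
The thread does not say which consistency level is in use, so QUORUM is an 
assumption here, and the helper names are hypothetical. A quorum is 
floor(RF / 2) + 1 replicas.

```python
def quorum(replication_factor):
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(rf, quorum(rf), tolerated_failures(rf))
```

At QUORUM, a replication factor of 2 needs both replicas up, while a 
replication factor of 3 still works with one replica down.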


On Sep 9, 2014, at 11:02 AM, Oleg Ruchovets <oruchov...@gmail.com> wrote:

> Hi,
>    Where can I find a document with best practices for sizing a 
> Cassandra deployment?
>    We have 1000 writes / reads per second, with a record size of 1 KB.
> Questions:
>    1) How many machines do we need?
>    2) How much RAM, and what disk size / type?
>    3) What should the network be?
> I understand that hardware depends heavily on data distribution, access 
> pattern and other criteria, but I still want to believe that there is a best 
> practice :-)
> Thanks
> Oleg.
