It will also depend on how long a recovery time you can tolerate. So imagine this case:
3 nodes w/ RF of 3
Each node has 30TB of space used (you never want to fill up an entire node).

If one node fails and you must recover, that will take over 3.6 days in just transferring data alone. That's with a sustained 800 megabit/s (100MB/s). In the real world it's going to fluctuate, so add some padding. Also, since you will be saturating one of the other nodes, your network latency is now suffering and you only have 1 machine to handle the remaining traffic while you're recovering.

And if you want to expand the cluster in the future (more nodes), the amount of data to transfer is going to be very large, and it will most likely take days to add machines.

From my experience it's much better to have a larger cluster set up upfront for future growth than getting by with 6-12 nodes at the start. You will feel less pain, and node failures (bad disks, mem, etc) are easier to manage.

3 nodes with RF of 1 wouldn't make sense.

On Sat, Sep 3, 2011 at 4:05 AM, China Stoffen <chinastof...@yahoo.com> wrote:

> Many small servers would drive up the hosting cost way too high so want to
> avoid this solution if we can.
>
> ----- Original Message -----
> From: Radim Kolar <h...@sendmail.cz>
> To: user@cassandra.apache.org
> Cc:
> Sent: Saturday, September 3, 2011 9:37 AM
> Subject: Re: commodity server spec
>
> many smaller servers way better
>
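
For anyone who wants to plug in their own numbers, the recovery estimate in the reply (30TB per node at a sustained 100MB/s) can be reproduced with a rough Python sketch. The function name transfer_days is made up for illustration, and the calculation assumes binary units (1TB = 1024 * 1024 MB), which is what gives "over 3.6 days"; with decimal units it comes out closer to 3.5. It only counts raw transfer time and ignores throttling, compaction, and fluctuating throughput, so treat the result as a lower bound.

```python
SECONDS_PER_DAY = 86400


def transfer_days(data_tb, rate_mb_per_s, binary_units=True):
    """Rough lower bound on the days needed to stream data_tb terabytes.

    Hypothetical helper for illustration only. Assumes a constant
    sustained rate and no protocol or disk overhead.
    """
    base = 1024 if binary_units else 1000
    data_mb = data_tb * base * base  # TB -> GB -> MB
    seconds = data_mb / rate_mb_per_s
    return seconds / SECONDS_PER_DAY


# 800 megabit/s is 100 MB/s (8 bits per byte)
rate_mb_per_s = 800 / 8

print(f"{transfer_days(30, rate_mb_per_s):.2f} days")
```

Halving the data per node (for example, by spreading the same data over more nodes) halves the estimate, which is the main argument for more, smaller nodes.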