Regarding what Netflix does, the last time I checked:

1) Sure, they use AWS VMs, but they take the whole machine. So is that really using a VM? :)
2) They use SSDs mainly to reduce compaction time: "We don't even notice it with SSD any more."

When sizing nodes and clusters, the main factors I've seen are:

a) What read latency are you trying to achieve? With 400 GB of data per node, 10 ms is easy, but 1 ms is hard. Your whole design will revolve around this if you want low latency.

b) How much data load per node is there? Bootstrapping and backup/restore get time-consuming and hard with more than 400 GB per node.

c) Are you planning to delete data? If so, that's harder to manage.

Other than that, the previous comments on RAM are pretty accurate. I would want more cores with vnodes to do more parallel operations.

Thanks, James Briggs.
-- Cassandra/MySQL DBA. Available in San Jose area or remote.

________________________________
From: Robert Coli <rc...@eventbrite.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Sent: Tuesday, September 9, 2014 2:44 PM
Subject: Re: hardware sizing for cassandra

On Tue, Sep 9, 2014 at 2:16 PM, Russell Bradberry <rbradbe...@gmail.com> wrote:

> Because RAM is expensive and the JVM heap is limited to 8 GB. While you do get benefit out of using extra RAM as page cache, it's often not cost efficient to do so.
>
> Again, this is so use-case dependent. I have met several people that run small nodes with fat RAM to get it all in memory to serve things in as few milliseconds as possible. This is a very common pattern in ad-tech where every millisecond counts. The tunable consistency and cross-datacenter replication make Cassandra very appealing, as it is difficult to set this up with other DBs.

Sure, it's also very common to run an RDBMS in such a mode that hundreds of gigabytes of RAM are available as either page cache or buffer pool. But "things are fast when you don't access slow disks" is not really a commentary on Cassandra specifically; "8 GB is the largest practical heap size with CMS GC" is.
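[The "8 GB practical heap with CMS GC" point is typically applied in cassandra-env.sh. The values below are an illustrative sketch, not settings from this thread:]

```shell
# Illustrative cassandra-env.sh excerpt (assumed values, not from this thread).
# Cap the CMS heap around 8 GB; larger CMS heaps tend to mean longer GC pauses.
MAX_HEAP_SIZE="8G"
# Young generation size; a common rule of thumb is ~100 MB per CPU core.
HEAP_NEWSIZE="800M"
# Use CMS for the old generation and ParNew for the young generation.
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
```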
> :D The recommended setup is 3 nodes and an RF of 3, to be able to make quorum reads/writes and survive an outage. But again, this is completely use-case dependent.

IMO, the minimum number of nodes you actually want to use in production with RF=3 is >=4, probably closer to 6. But as you say, use-case dependent.

=Rob
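[The quorum arithmetic behind the 3-node / RF=3 recommendation can be sketched as follows; the helper functions are hypothetical, not Cassandra code:]

```python
# Sanity-check of Cassandra's quorum math: QUORUM requires
# floor(RF / 2) + 1 replicas to acknowledge a read or write.

def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM read/write at replication factor rf."""
    return rf // 2 + 1

def tolerable_failures(rf: int) -> int:
    """Replicas you can lose while still achieving QUORUM."""
    return rf - quorum(rf)

# With RF=3, quorum is 2, so one replica can be down and
# quorum reads/writes still succeed:
print(quorum(3), tolerable_failures(3))  # 2 1
```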