Distinguished Colleagues:
Our current Cassandra cluster on AWS looks like this:
3 nodes in N. Virginia, one per zone.
RF=3
Each node is a c3.4xlarge with 2x160 GB SSDs in RAID-0 (~300 GB of SSD
on each node). This works great; I find it the best configuration for a
Cassandra node.
However, I will soon need to expand storage capacity.
I see the following options:
1) Add 3 more c3.4xlarge nodes. This keeps the amount of data on each
node reasonable, so repairs and other maintenance tasks can complete in
a reasonable amount of time. The downside is that c3.4xlarge instances
are pricey.
2) Add provisioned-IOPS EBS volumes to the existing nodes. These days I
can get SSD-backed EBS with up to 4000 IOPS provisioned. I can add those
volumes to the "data_file_directories" list in cassandra.yaml, and I
expect Cassandra can deal with that JBOD-style. The upside is that it is
much cheaper than option #1 above; the downside is that it is a much
slower configuration with more data on each node, so repairs can take
longer.
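
To put rough numbers on the trade-off, here is a back-of-the-envelope
sketch. The EBS size per node (300 GB) is a hypothetical figure I picked
so that both options end up with the same total capacity; the function
names are mine, and the math ignores compaction headroom, snapshots, and
uneven token distribution.

```python
RF = 3
LOCAL_SSD_GB = 300      # ~300 GB RAID-0 per c3.4xlarge, as described above
EBS_GB_PER_NODE = 300   # ASSUMPTION: hypothetical EBS size per node


def unique_capacity_gb(nodes, disk_per_node_gb, rf=RF):
    """Raw disk across the cluster divided by the replication factor."""
    return nodes * disk_per_node_gb / rf


def load_per_node_gb(unique_data_gb, nodes, rf=RF):
    """Data each node stores for a given unique dataset, evenly balanced."""
    return unique_data_gb * rf / nodes


# (node count, disk per node in GB) for each layout
options = {
    "current": (3, LOCAL_SSD_GB),
    "option 1 (6 nodes)": (6, LOCAL_SSD_GB),
    "option 2 (3 nodes + EBS)": (3, LOCAL_SSD_GB + EBS_GB_PER_NODE),
}

for name, (nodes, disk) in options.items():
    cap = unique_capacity_gb(nodes, disk)
    per_node = load_per_node_gb(cap, nodes)
    print(f"{name}: {cap:.0f} GB unique capacity, "
          f"{per_node:.0f} GB per node when full")
```

Under these assumptions both options double the unique capacity, but
option 2 does it by doubling the data each node has to repair and
compact, while option 1 keeps the per-node load where it is today.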
I'd appreciate any input on this topic.
Thanks in advance,
Oleg