Thanks for the feedback. By "small" I mean that I currently have 6x m1.xlarge instances running Cassandra 3.0.17. The total amount of data is around 1.5TB, spread across a couple of keyspaces with RF=3.
Over time a few things happened or became clear, including:

- The amount of ingested data increased.
- The m1.xlarge instances are somewhat outdated. We noticed that one of them is underperforming compared to the others, networking is not always stable or reliable, and so on.
- Upgrading from 3.0.6 to 3.0.17 emphasized the need for better hardware even more (in my opinion).

Starting from here, I believe that i3/r5d are already a much better option than what we have, at a comparable price.

About EBS: yes, I am aware that its performance is related to its size (and type). That is the reason why I was looking into a 600/900GB drive, which is already a much better option compared to our RAID 0 of spinning disks. Both i3 and r5d are EBS-optimized.

Regards,

On Mon, Dec 10, 2018 at 2:38 PM Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:

> On Mon, Dec 10, 2018 at 12:20 PM Riccardo Ferrari <ferra...@gmail.com>
> wrote:
>
>> I am wondering what instance type is best for a small cassandra cluster
>> on AWS.
>>
>
> Define "small" :-D
>
>
>> Actually I'd like to compare, or have your opinion about the following
>> instances:
>>
>>    - r5*d*.xlarge (4vCPU, *19*ecu, 32GB ram and 1 NVMe instance store
>>    150GB)
>>       - Need to attach a 600/900GB ESB
>>    - i3.xlarge (4vCPU, *13ecu, *30.5GB ram and 9.5TB NVMe instance
>>    store)
>>
>> Both have up to 10Gb networking.
>> I see AWS mark i3 as the NoSQL DB instances nevertheless r5d seems bit
>> better CPU wise. Putting a decently sized gp2 EBS I should have enough IOPS
>> especially we think to put commitlog and such on the 150GB NVMe storage.
>> About the workload: mostly TWCS inserts and upserts on LCS.
>>
>
> So there is a number of trade-offs:
>
> 1. With EBS you have more flexibility when it comes to scaling compute
> power: you don't have to rebuild data directory from scratch.  At the same
> time, EBS performance can be limited by the volume itself (it depends on
> volume type *and* size), and it can also be limited by instance type.  You
> might not be able to reach max throughput of a big volume with a small
> instance attached.
>
> 2. I didn't try to run Cassandra with i2 or i3 instances.  These are
> optimized for a lot of random IO, though with Cassandra what you should be
> seeing is mostly sequential IO, so I'm not sure you're going to utilize the
> NVMes fully.  Some AWS features, like auto-recovery, only work with
> instances using EBS-backed storage exclusively.
>
> Cheers,
> --
> Alex
>
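
P.S. To put rough numbers on the EBS sizing point: as far as I know, gp2 baseline IOPS scale at 3 IOPS per GiB, with a floor of 100 and a per-volume cap, and volumes whose baseline is below 3,000 IOPS can burst up to 3,000. Below is a minimal sketch under those assumptions. The function names and constants are mine (not from any AWS tooling), the cap value is an assumption that should be checked against the current AWS documentation, and I treat the 600/900GB sizes as GiB for simplicity.

```python
# Rough gp2 sizing estimate. All constants are assumptions taken from my
# reading of the AWS gp2 documentation; verify them before relying on this.
GP2_IOPS_PER_GIB = 3      # baseline IOPS gained per provisioned GiB
GP2_MIN_IOPS = 100        # floor for very small volumes
GP2_MAX_IOPS = 16_000     # assumed per-volume cap
GP2_BURST_IOPS = 3_000    # burst ceiling for volumes with a lower baseline


def gp2_baseline_iops(size_gib: int) -> int:
    """Return the estimated baseline IOPS for a gp2 volume of size_gib."""
    if size_gib <= 0:
        raise ValueError("volume size must be positive")
    scaled = size_gib * GP2_IOPS_PER_GIB
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, scaled))


def gp2_can_burst(size_gib: int) -> bool:
    """Return True if the volume's baseline is below the burst ceiling."""
    return gp2_baseline_iops(size_gib) < GP2_BURST_IOPS


if __name__ == "__main__":
    for size in (600, 900):
        print(f"{size} GiB gp2: baseline ~{gp2_baseline_iops(size)} IOPS, "
              f"can burst: {gp2_can_burst(size)}")
```

Under these assumptions both candidate sizes sit below the burst ceiling, so sustained compaction load would run at the baseline rather than the burst figure. The instance's own EBS bandwidth limit is a separate constraint that this sketch does not model.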