What operator are you all using? We've just been using StatefulSets for
our clusters. I'm a big-time on-hardware fan, but an issue with
Cassandra is the notion of one JVM per roughly 1 to 2 TB of disk.
Most large servers are in the 256+ core / 100+ TB of disk range.
Managing that many instances of Cassandra on a single node is painful.
Kubernetes solves that. Want to scale up?
kubectl scale statefulset cassandra -n cassandra --replicas=48
or whatever. Doing a rolling restart is easy. On 'large' deployments
of over 500 TB of disk, I'm not sure how this can be easily managed
without Kubernetes. How is it done?
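For example, assuming a StatefulSet named cassandra in a cassandra
namespace, a rolling restart and a watch on its progress would look
something like:
kubectl rollout restart statefulset/cassandra -n cassandra
kubectl rollout status statefulset/cassandra -n cassandra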
As to persistent storage: yes, I miss the days of Hadoop and HDFS! But
here we are... most things seem to be going down the path of storage
as a network device, no longer local. I don't like it either, but
there are certainly management and ease-of-use considerations.
-Joe
On 6/12/2025 10:15 AM, Jon Haddad wrote:
I agree that managing Cassandra on Kubernetes can be challenging
without prior experience, as understanding all the nuances of
Kubernetes takes time.
However, there are ways to address the rescheduling issues, node
placement, and local disk concerns that were mentioned. You can pin
pods to specific hosts to avoid rescheduling on different nodes, and
you can use local disks or a combination of persistent disks with a
local NVMe as a cache. Host networking or (I think) Cilium can help
with the networking performance concerns. For most arguments against
using Kubernetes, there's usually a workaround or setting that can
address the issue.
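As a rough sketch of what that pinning and local storage can look like
in a StatefulSet (the node label, storage class, image tag, and sizes
here are illustrative assumptions, not a vetted production config):

  apiVersion: apps/v1
  kind: StatefulSet
  metadata:
    name: cassandra
  spec:
    serviceName: cassandra
    replicas: 3
    selector:
      matchLabels:
        app: cassandra
    template:
      metadata:
        labels:
          app: cassandra
      spec:
        hostNetwork: true        # use the node's network stack directly
        nodeSelector:
          workload: cassandra    # pin pods to dedicated, labeled nodes
        containers:
          - name: cassandra
            image: cassandra:4.1
            volumeMounts:
              - name: data
                mountPath: /var/lib/cassandra
    volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: local-storage  # e.g. local NVMe via local PVs
          resources:
            requests:
              storage: 1Ti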
The main advantage of Kubernetes is the operator. While it has some
quirks, it generally does a good job of managing your deployment,
eliminating the need to write all your workflows. Building on
Kubernetes as a standard offers the advantage of applying your
knowledge across various environments once you're familiar with it.
I wouldn't recommend jumping into Kubernetes and Cassandra
simultaneously. Both are complex topics. I've worked with Cassandra
for over a decade and Kubernetes on and off for five years, and I
still encounter challenges, especially when my desired outcome differs
from the operator's.
Both approaches are workable. Both have tradeoffs. For now, I'm also
sticking to baking AMIs [3], but with more experience on K8s and a
little more maturity from Cassandra, I'd think differently. For
stateless apps, I'm 100% on board with K8s.
Jon
[1] https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-36%3A+A+Configurable+ChannelProxy+to+alias+external+storage+locations
[2] https://lists.apache.org/thread/r0nhyyn6mbpy55fl90xqcj17v6w3wxg3
[3] https://github.com/rustyrazorblade/easy-cass-lab/tree/main/packer
On Thu, Jun 12, 2025 at 6:17 AM Luciano Greiner
<luciano.grei...@gmail.com> wrote:
Quick correction on my previous message — I assumed you were referring
to running Cassandra on Kubernetes, not purely ECS.
Many of the same concerns still apply. ECS tasks can also be
rescheduled or moved between instances, which poses risks for
Cassandra’s rack awareness and replica distribution. Ensuring stable
node identity and local storage is still tricky.
Cassandra works best when it's tightly coupled to its hardware —
ideally on dedicated VMs or bare metal — where you have full control
over topology and disk performance.
Luciano Greiner
On Thu, Jun 12, 2025 at 10:13 AM Luciano Greiner
<luciano.grei...@gmail.com> wrote:
>
> I usually advise against running Cassandra (or most databases) inside
> Kubernetes. It might look like it simplifies operations, but in my
> experience, it tends to introduce more complexity than it solves.
>
> With Cassandra specifically, Kubernetes may reschedule pods for
> reasons outside your control (e.g., node pressure, restarts,
> upgrades). This can lead to topology violations — for example, all
> replicas ending up in the same physical rack, defeating the
purpose of
> proper rack and datacenter awareness.
>
> Another major issue is storage. Cassandra expects fast, local disks
> close to the compute layer. While Kubernetes StatefulSets can use
> PersistentVolumes, these are often network-attached and may not offer
> the performance or locality guarantees Cassandra needs. And if your
> pods get rescheduled, depending on your storage class and cloud
> provider, you may run into delays or errors reattaching volumes.
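>
> For reference, the "local" flavor of PersistentVolume is pinned to a
> single node, so it cannot silently follow a rescheduled pod (an
> illustrative sketch; the storage class, path, and hostname are
> hypothetical):
>
>   apiVersion: v1
>   kind: PersistentVolume
>   metadata:
>     name: cassandra-data-node1
>   spec:
>     capacity:
>       storage: 1Ti
>     accessModes: ["ReadWriteOnce"]
>     persistentVolumeReclaimPolicy: Retain
>     storageClassName: local-storage
>     local:
>       path: /mnt/nvme0            # NVMe mount on the node itself
>     nodeAffinity:                 # required for local volumes; ties
>       required:                   # the PV to one specific node
>         nodeSelectorTerms:
>           - matchExpressions:
>               - key: kubernetes.io/hostname
>                 operator: In
>                 values: ["node1"]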
>
> Using an operator like K8ssandra doesn't necessarily eliminate these
> problems — it just adds another tool to manage within the puzzle.
>
> Luciano Greiner
>
> On Thu, Jun 12, 2025 at 6:20 AM Dor Laor via user
> <user@cassandra.apache.org> wrote:
> >
> > It's possible to manage Cassandra well both with VMs and containers.
> > As you'd be running one container per VM, there is no significant
> > advantage for containers. K8s provides nice tooling and some
> > methodological enforcement which brings order to the setup, but if
> > the team isn't top-notch in k8s, it's not worth the trouble and the
> > limitations that come with it (networking outside the k8s cluster,
> > etc.). It's good to have fewer layers. Most users run databases
> > outside of containers.
> >
> > On Wed, Jun 11, 2025 at 11:36 PM Raymond Yu <rayyu...@gmail.com>
> > wrote:
> >>
> >> Hi Cassandra community,
> >>
> >> I would like to ask for your expert opinions regarding a
> >> discussion we're having about deploying Cassandra on AWS EC2 vs.
> >> AWS ECS. For context, we have a small dedicated DB engineering
> >> team that is familiar with operating and supporting Cassandra on
> >> EC2 for many customer teams. However, one team has developed
> >> custom tooling for operating Cassandra on ECS (EC2-backed) and
> >> would like for us to migrate to it for their Cassandra needs,
> >> which has spawned this discussion (K8ssandra was considered, but
> >> that team did not want to use Kubernetes).
> >>
> >> Further context on our team and experience:
> >> - Small dedicated team supporting Cassandra (and other DBs)
> >> - Familiar with operating Cassandra on EC2
> >> - Familiar with standard IaC tools and languages
> >> (Ansible/Terraform/Python/etc.)
> >> - Only deploy in AWS
> >>
> >> Discussed points regarding staying with EC2:
> >> - Existing team experience and automation in deploying Cassandra
> >> on EC2
> >> - Simpler solution is easier to support and maintain
> >> - Almost all documentation we can find and use is specific to
> >> deploying on EC2
> >> - Third-party support is familiar with EC2 by default
> >> - Lower learning curve for engineers to onboard
> >> - More hands-on maintenance regarding OS upgrades
> >> - Less modern solution
> >>
> >> Discussed points regarding using the new ECS solution:
> >> - Containers are the more modern solution
> >> - Node autoheal feature in addition to standard C* operations via
> >> a control plane
> >> - Higher tool and architecture complexity that requires ramp-up in
> >> order to use and support effectively
> >> - We're on our own for potential issues with the tool itself after
> >> it would be handed off
> >> - No demonstrated performance gain over EC2-based clusters
> >> - Third-party support would be less familiar with dealing with ECS
> >> issues
> >> - Deployed on EC2 under the hood (one container per VM), so the
> >> underlying architecture is the same between both solutions
> >>
> >> Given that context, our team generally feels that there is little
> >> marginal benefit given the cost of ramp-up and supporting a custom
> >> tool, but there has also been a request for harder evidence and
> >> outside opinions on the topic. It has been hard to find
> >> documentation of this specific comparison of EC2 vs. ECS to
> >> reference. We'd love to hear your thoughts on our context, but we
> >> are also interested in any general recommendations for one over
> >> the other. Thanks in advance!
> >>
> >> Best,
> >> Raymond Yu