Re: [DISCUSS] CEP-62: Cassandra Configuration Management via Sidecar

Joel Shepherd Mon, 30 Mar 2026 10:59:36 -0700

On 3/26/2026 12:30 PM, Paulo Motta wrote:

> Is there a write-up somewhere of how Sidecar auth works end-to-end?
I don't think there is a formal doc available but you can find moreinformation in the PR that introduced it:
- https://issues.apache.org/jira/browse/CASSSIDECAR-161
-https://github.com/apache/cassandra-sidecar/commit/5a19e3448038fa4b2e9f497ab94dbbe911f44c29


Thanks - I'll give those a read.

> Not directly related: in that section it says "Sidecar does notdepend on Cassandra being running", but in the auth section it says"Permissions are resolved from Sidecar's|sidecar_internal.role_permissions_v1| tables". Can Sidecar accesstables on the node directly, without going through the main Cassandraprocess?
That's a good point. When the managed node is offline it's possible toaccess authorization tables through other nodes, by specifyingmultiple CQL contact points in sidecar.yml. Alternatively it'spossible to set up admin_identities in sidecar.yml allowing admins toperform config changes in the event that auth tables are unavailable.I have updated the authorization section with a note about this.


Thanks for doing that.

One thing to maybe be intentional about avoiding is creating adependency loop where a cluster gets into a state where it can't startwithout sidecar, and sidecar can't start without the cluster. It'sprobably more important to avoid dependencies in Cassandra that wouldprevent a cluster from cold-starting. Imagine a rack or datacenterlosing power and a cluster having to come back online as quickly aspossible. Ideally, it should be more or less automatic as the serversreboot but if a circular dependency is introduced, recovery from thatkind of outage could be difficult.

Let me know if you have any remaining questions or comments before webring this to a vote.


None of my comments above should be in the way of that.

Thanks -- Joel.

On Wed, Mar 25, 2026 at 6:16 PM Joel Shepherd <[email protected]> wrote:

    Thanks, Paulo - I think I'm still trying to form a mental model of
    where Sidecar's responsibilities start and end. It sounds like for
    this proposal, it's scope is basically applying the "current"
    configuration to the node(s), but the configuration management
    itself needs to be done above Sidecar. Makes sense.

    A couple questions below for my own learning, and a response as
    well...

    On 3/18/2026 3:14 PM, Paulo Motta wrote:

    Thanks for the feedback Joel! See follow-up below:

    > * Authorization - What are the authorization controls.

    Good call! This will use Sidecar's existing authorization
    mechanism per HTTP endpoint. Two new permissions will be added:
    CONFIGURATION:READ and CONFIGURATION:MODIFY to control reading or
    updating configs. I've updated the doc with a new section about
    authorization.


    Is there a write-up somewhere of how Sidecar auth works end-to-end?


    > * Integrity - I see you're using hashes for conflict detection.
    Have you
    considered using them as integrity checks as well: e.g., to guarantee
    the configuration deployed for the node/instance to load at
    runtime is
    the same configuration computed by the configuration manager (I think
    that's the right component)?

    The deployed configuration is guaranteed to be the same
    configuration computed by the configuration manager, since the
    runtime configuration is always refreshed during instance
    startup. If the runtime configuration is corrupted it would be
    overwritten by the recomputed config. Let me know if this cover
    the scenario you have in mind or if I missed something.


    I guess I was thinking a little beyond the scope of this change,
    to: how does the node know that the configuration hasn't been
    altered from what it's intended to be, regardless of whether
    that's through cosmic ray flipping a bit, or a misbehaving file
    system corrupting bytes, or manual modification? With the
    introduction of hashes, you may have a path towards hardening this
    by enabling the node to validate the integrity of its
    configuration at runtime, which will be more useful as Sidecar
    makes automated deployments of configuration easier. Color me
    paranoid.

    But, I acknowledge this is outside the scope of this CEP.

    > * Rollback/forward - If I push a bad configuration change, how
    do I as
    the the administrator respond to that?

    Change tracking is explicitly not a goal of this CEP to keep the
    scope limited. When a bad configuration is pushed, the operator
    would need to manually revert by submitting another PATCH request
    undoing the bad configuration. An external RCS would need to be
    used to keep track of config history if needed. I've added a new
    future work entry to support change tracking natively. I've also
    added an Operational Guide section with an overview of how this
    is expected to be used. Let me know if this makes sense.


    Thanks so much for doing that: it's really helpful to talk about
    how the user will work with the feature up-front.

    Not directly related: in that section it says "Sidecar does not
    depend on Cassandra being running", but in the auth section it
    says "Permissions are resolved from Sidecar's
    |sidecar_internal.role_permissions_v1| tables". Can Sidecar access
    tables on the node directly, without going through the main
    Cassandra process?

    Thanks again -- Joel.


    On Tue, 17 Mar 2026 at 17:04 Joel Shepherd <[email protected]>
    wrote:

        Hi Paulo - Interesting CEP, and potentially very useful: thanks!

        I was wondering about several things as I was reading through it:

        * Authorization - Particularly for operations that mutate
        configuration (either in the store or at run-time for the
        node). What are the authorization controls.

        * Integrity - I see you're using hashes for conflict
        detection. Have you considered using them as integrity checks
        as well: e.g., to guarantee the configuration deployed for
        the node/instance to load at runtime is the same
        configuration computed by the configuration manager (I think
        that's the right component)? This would be a guard against
        bugs, network gremlins, file system gremlins, etc., quietly
        corrupting the configuration that the node will eventually read.

        * Visibility - As an 'administrator' how do I determine how
        much of my cluster is running on the latest configuration,
        and which nodes specifically aren't? Is it up to me to
        implement that monitoring?

        * Rollback/forward - If I push a bad configuration change,
        how do I as the the administrator respond to that? For
        example, is there an assumption that I'll be managing my
        configuration in a RCS somewhere and will be expected to
        quickly retrieve a known-good older revision from it and push
        it through sidecar? It might be helpful to have a "user
        experience" section in the CEP to describe how you envision
        users managing their cluster's configuration through this
        tool: what they're responsible for, what the tool is
        responsible for.

        Thanks -- Joel.

        On 3/17/2026 9:32 AM, Paulo Motta wrote:


        Hi everyone,

        I'd like to propose CEP-62: Cassandra Configuration
        Management via Sidecar for discussion by the community.

        CASSSIDECAR-266[1] introduced Cassandra process lifecycle
        management capabilities to Sidecar, giving operators the
        ability to start and stop Cassandra instances
        programmatically. However, Sidecar currently has no way to
        manipulate the configuration files that those instances
        consume at startup.

        Many Cassandra settings (memtable configuration, SSTable
        settings, storage_compatibility_mode) cannot be modified at
        runtime via JMX/CQL and must be set in cassandra.yaml or JVM
        options files, requiring a restart to take effect. Managing
        these files manually or through custom tooling is cumbersome
        and lacks a stable API.

        This CEP extends Sidecar's lifecycle management by adding
        configuration management capabilities for persisted
        configuration artifacts. It introduces a REST API for
        reading and updating cassandra.yaml and JVM options, a
        pluggable ConfigurationProvider abstraction for integration
        with centralized configuration systems (etcd, Consul, or
        custom backends), and version-aware validation to prevent
        startup failures.

        This CEP also serves as a prerequisite for future Cassandra
        upgrades via Sidecar. For example, upgrading from Cassandra
        4 to Cassandra 5 requires updating
        storage_compatibility_mode in cassandra.yaml. The
        configuration management capabilities introduced here will
        enable Sidecar to orchestrate such upgrades by updating
        configuration artifacts alongside binary version changes.

        The CEP is linked here:
        
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar

        Looking forward to your feedback!

        Thanks,

        Paulo

        [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266

Re: [DISCUSS] CEP-62: Cassandra Configuration Management via Sidecar

Reply via email to