Hi everyone,

Thanks for the discussion and feedback so far; it has been very helpful!

I have updated the doc with some of the suggestions and follow-up work
discussed on this thread.

Let me know if you have any remaining questions or comments before we bring
this to a vote.

Cheers,

Paulo

On Thu, Mar 26, 2026 at 3:30 PM Paulo Motta <[email protected]> wrote:

> Hi Joel,
>
> Thanks for the meaningful feedback, see follow-up below:
>
> >  I think I'm still trying to form a mental model of where Sidecar's
> responsibilities start and end. It sounds like for this proposal, its scope
> is basically applying the "current" configuration to the node(s), but the
> configuration management itself needs to be done above Sidecar.
>
> Yes, the main goal of this is to add awareness of the node configuration
> to sidecar so we can start exploring things like rolling config updates and
> config drift detection within sidecar, but we still need an external
> orchestrator to bootstrap the config.
>
> >  With the introduction of hashes, you may have a path towards hardening
> this by enabling the node to validate the integrity of its configuration at
> runtime, which will be more useful as Sidecar makes automated deployments
> of configuration easier.
>
> I agree! This CEP is a step in that direction: it will allow operators to
> express the desired config, which can then be compared with the running
> config from the `system_views.settings` virtual table to detect and report
> any discrepancies.
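>
> As a sketch of that comparison, a drift check could read the live value
> from the virtual table and diff it against the desired config (the
> `name`/`value` columns are the ones exposed by `system_views.settings`;
> the specific setting below is just an example):
>
> ```sql
> -- Read the running value of one setting from the managed node via CQL.
> SELECT name, value
> FROM system_views.settings
> WHERE name = 'storage_compatibility_mode';
> ```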
>
> > Is there a write-up somewhere of how Sidecar auth works end-to-end?
>
> I don't think there is a formal doc available, but you can find more
> information in the ticket and commit that introduced it:
> - https://issues.apache.org/jira/browse/CASSSIDECAR-161
> - https://github.com/apache/cassandra-sidecar/commit/5a19e3448038fa4b2e9f497ab94dbbe911f44c29
>
> >  Not directly related: in that section it says "Sidecar does not depend
> on Cassandra being running", but in the auth section it says "Permissions
> are resolved from Sidecar's sidecar_internal.role_permissions_v1 tables".
> Can Sidecar access tables on the node directly, without going through the
> main Cassandra process?
>
> That's a good point. When the managed node is offline it's possible to
> access authorization tables through other nodes, by specifying multiple CQL
> contact points in sidecar.yml. Alternatively it's possible to set up
> admin_identities in sidecar.yml allowing admins to perform config changes
> in the event that auth tables are unavailable. I have updated the
> authorization section with a note about this.
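>
> As a rough illustration of the two fallbacks above (key names are
> illustrative only; the actual sidecar.yaml schema may differ):
>
> ```yaml
> # Illustrative sketch, not the exact sidecar.yaml schema.
> access_control:
>   admin_identities:            # admins that can still act when the
>     - spiffe://cassandra/admin #   auth tables are unreachable
> cassandra_instances:           # multiple CQL contact points so auth
>   - host: node1.example.com    #   lookups survive a local node outage
>   - host: node2.example.com
> ```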
>
> Cheers,
>
> Paulo
>
> On Wed, Mar 25, 2026 at 6:16 PM Joel Shepherd <[email protected]> wrote:
>
>> Thanks, Paulo - I think I'm still trying to form a mental model of where
>> Sidecar's responsibilities start and end. It sounds like for this proposal,
>> its scope is basically applying the "current" configuration to the
>> node(s), but the configuration management itself needs to be done above
>> Sidecar. Makes sense.
>>
>> A couple questions below for my own learning, and a response as well...
>> On 3/18/2026 3:14 PM, Paulo Motta wrote:
>>
>> Thanks for the feedback Joel! See follow-up below:
>>
>> > * Authorization - What are the authorization controls?
>>
>> Good call! This will use Sidecar's existing authorization mechanism per
>> HTTP endpoint. Two new permissions will be added: CONFIGURATION:READ and
>> CONFIGURATION:MODIFY to control reading or updating configs. I've updated
>> the doc with a new section about authorization.
>>
>> Is there a write-up somewhere of how Sidecar auth works end-to-end?
>>
>>
>> > * Integrity - I see you're using hashes for conflict detection. Have you
>> considered using them as integrity checks as well: e.g., to guarantee
>> the configuration deployed for the node/instance to load at runtime is
>> the same configuration computed by the configuration manager (I think
>> that's the right component)?
>>
>> The deployed configuration is guaranteed to be the same configuration
>> computed by the configuration manager, since the runtime configuration is
>> always refreshed during instance startup. If the runtime configuration is
>> corrupted it would be overwritten by the recomputed config. Let me know if
>> this covers the scenario you have in mind or if I missed something.
>>
>> I guess I was thinking a little beyond the scope of this change, to: how
>> does the node know that the configuration hasn't been altered from what
>> it's intended to be, regardless of whether that's through cosmic ray
>> flipping a bit, or a misbehaving file system corrupting bytes, or manual
>> modification? With the introduction of hashes, you may have a path towards
>> hardening this by enabling the node to validate the integrity of its
>> configuration at runtime, which will be more useful as Sidecar makes
>> automated deployments of configuration easier. Color me paranoid.
>>
>> But, I acknowledge this is outside the scope of this CEP.
>>
>> > * Rollback/forward - If I push a bad configuration change, how do I as
>> the administrator respond to that?
>>
>> Change tracking is explicitly not a goal of this CEP to keep the scope
>> limited. When a bad configuration is pushed, the operator would need to
>> manually revert by submitting another PATCH request undoing the bad
>> configuration. An external RCS can be used to keep track of config
>> history if needed. I've added a new future work entry to support
>> change tracking natively. I've also added an Operational Guide section with
>> an overview of how this is expected to be used. Let me know if this makes
>> sense.
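>>
>> To make the manual revert concrete, it would just be another PATCH
>> carrying the previous known-good values (the endpoint path and header
>> below are illustrative, not the final API):
>>
>> ```
>> PATCH /api/v1/cassandra/config HTTP/1.1
>> If-Match: "<hash-of-known-good-config>"
>> Content-Type: application/json
>>
>> { "concurrent_compactors": 2 }
>> ```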
>>
>> Thanks so much for doing that: it's really helpful to talk about how the
>> user will work with the feature up-front.
>>
>> Not directly related: in that section it says "Sidecar does not depend on
>> Cassandra being running", but in the auth section it says "Permissions are
>> resolved from Sidecar's sidecar_internal.role_permissions_v1 tables".
>> Can Sidecar access tables on the node directly, without going through the
>> main Cassandra process?
>>
>> Thanks again -- Joel.
>>
>>
>>
>> On Tue, 17 Mar 2026 at 17:04 Joel Shepherd <[email protected]> wrote:
>>
>>> Hi Paulo - Interesting CEP, and potentially very useful: thanks!
>>>
>>> I was wondering about several things as I was reading through it:
>>>
>>> * Authorization - Particularly for operations that mutate configuration
>>> (either in the store or at run-time for the node). What are the
>>> authorization controls?
>>>
>>> * Integrity - I see you're using hashes for conflict detection. Have you
>>> considered using them as integrity checks as well: e.g., to guarantee the
>>> configuration deployed for the node/instance to load at runtime is the same
>>> configuration computed by the configuration manager (I think that's the
>>> right component)? This would be a guard against bugs, network gremlins,
>>> file system gremlins, etc., quietly corrupting the configuration that the
>>> node will eventually read.
>>>
>>> * Visibility - As an 'administrator' how do I determine how much of my
>>> cluster is running on the latest configuration, and which nodes
>>> specifically aren't? Is it up to me to implement that monitoring?
>>>
>>> * Rollback/forward - If I push a bad configuration change, how do I as
>>> the administrator respond to that? For example, is there an assumption
>>> that I'll be managing my configuration in an RCS somewhere and will be
>>> expected to quickly retrieve a known-good older revision from it and push
>>> it through sidecar? It might be helpful to have a "user experience" section
>>> in the CEP to describe how you envision users managing their cluster's
>>> configuration through this tool: what they're responsible for, what the
>>> tool is responsible for.
>>>
>>> Thanks -- Joel.
>>> On 3/17/2026 9:32 AM, Paulo Motta wrote:
>>>
>>>
>>> Hi everyone,
>>>
>>> I'd like to propose CEP-62: Cassandra Configuration Management via
>>> Sidecar for discussion by the community.
>>>
>>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management
>>> capabilities to Sidecar, giving operators the ability to start and stop
>>> Cassandra instances programmatically. However, Sidecar currently has no way
>>> to manipulate the configuration files that those instances consume at
>>> startup.
>>>
>>> Many Cassandra settings (memtable configuration, SSTable settings,
>>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and
>>> must be set in cassandra.yaml or JVM options files, requiring a restart to
>>> take effect. Managing these files manually or through custom tooling is
>>> cumbersome and lacks a stable API.
>>>
>>> This CEP extends Sidecar's lifecycle management by adding configuration
>>> management capabilities for persisted configuration artifacts. It
>>> introduces a REST API for reading and updating cassandra.yaml and JVM
>>> options, a pluggable ConfigurationProvider abstraction for integration with
>>> centralized configuration systems (etcd, Consul, or custom backends), and
>>> version-aware validation to prevent startup failures.
>>>
>>> This CEP also serves as a prerequisite for future Cassandra upgrades via
>>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires
>>> updating storage_compatibility_mode in cassandra.yaml. The configuration
>>> management capabilities introduced here will enable Sidecar to orchestrate
>>> such upgrades by updating configuration artifacts alongside binary version
>>> changes.
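>>>
>>> For instance, the 4-to-5 step mentioned above amounts to a one-line
>>> cassandra.yaml change that currently requires a restart (UPGRADING is
>>> one of the compatibility modes defined by Cassandra 5):
>>>
>>> ```yaml
>>> # Set while nodes run mixed 4.x/5.x versions; switch to NONE once
>>> # the whole cluster is on 5.x.
>>> storage_compatibility_mode: UPGRADING
>>> ```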
>>>
>>> The CEP is linked here:
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>>
>>> Looking forward to your feedback!
>>>
>>> Thanks,
>>>
>>> Paulo
>>>
>>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
>>>
>>>
