Hi everyone,

Thanks for the discussion and feedback so far; it has been very helpful!

I have updated the doc with some of the suggestions and follow-up work
discussed on this thread. Let me know if you have any remaining questions
or comments before we bring this to a vote.

Cheers,

Paulo

On Thu, Mar 26, 2026 at 3:30 PM Paulo Motta <[email protected]> wrote:

> Hi Joel,
>
> Thanks for the meaningful feedback; see follow-up below:
>
> > I think I'm still trying to form a mental model of where Sidecar's
> > responsibilities start and end. It sounds like for this proposal, its
> > scope is basically applying the "current" configuration to the node(s),
> > but the configuration management itself needs to be done above Sidecar.
>
> Yes, the main goal of this is to add awareness of the node configuration
> to sidecar so we can start exploring things like rolling config updates
> and config drift detection within sidecar, but we still need an external
> orchestrator to bootstrap the config.
>
> > With the introduction of hashes, you may have a path towards hardening
> > this by enabling the node to validate the integrity of its
> > configuration at runtime, which will be more useful as Sidecar makes
> > automated deployments of configuration easier.
>
> I agree! This CEP is a step in this direction, since it will allow
> operators to express the desired config and we can compare with the
> running config from the `system_views.settings` virtual table to detect
> and report any discrepancies.
>
> > Is there a write-up somewhere of how Sidecar auth works end-to-end?
>
> I don't think there is a formal doc available but you can find more
> information in the PR that introduced it:
> - https://issues.apache.org/jira/browse/CASSSIDECAR-161
> - https://github.com/apache/cassandra-sidecar/commit/5a19e3448038fa4b2e9f497ab94dbbe911f44c29
>
> > Not directly related: in that section it says "Sidecar does not depend
> > on Cassandra being running", but in the auth section it says
> > "Permissions are resolved from Sidecar's
> > sidecar_internal.role_permissions_v1 tables".
> > Can Sidecar access tables on the node directly, without going through
> > the main Cassandra process?
>
> That's a good point. When the managed node is offline it's possible to
> access authorization tables through other nodes, by specifying multiple
> CQL contact points in sidecar.yml. Alternatively it's possible to set up
> admin_identities in sidecar.yml allowing admins to perform config changes
> in the event that auth tables are unavailable. I have updated the
> authorization section with a note about this.
>
> Cheers,
>
> Paulo
>
> On Wed, Mar 25, 2026 at 6:16 PM Joel Shepherd <[email protected]> wrote:
>
>> Thanks, Paulo - I think I'm still trying to form a mental model of where
>> Sidecar's responsibilities start and end. It sounds like for this
>> proposal, its scope is basically applying the "current" configuration to
>> the node(s), but the configuration management itself needs to be done
>> above Sidecar. Makes sense.
>>
>> A couple questions below for my own learning, and a response as well...
>> On 3/18/2026 3:14 PM, Paulo Motta wrote:
>>
>> Thanks for the feedback Joel! See follow-up below:
>>
>> > * Authorization - What are the authorization controls?
>>
>> Good call! This will use Sidecar's existing authorization mechanism per
>> HTTP endpoint. Two new permissions will be added: CONFIGURATION:READ and
>> CONFIGURATION:MODIFY to control reading or updating configs. I've updated
>> the doc with a new section about authorization.
>>
>> Is there a write-up somewhere of how Sidecar auth works end-to-end?
>>
>>
>> > * Integrity - I see you're using hashes for conflict detection. Have
>> > you considered using them as integrity checks as well: e.g., to
>> > guarantee the configuration deployed for the node/instance to load at
>> > runtime is the same configuration computed by the configuration
>> > manager (I think that's the right component)?
>>
>> The deployed configuration is guaranteed to be the same configuration
>> computed by the configuration manager, since the runtime configuration is
>> always refreshed during instance startup. If the runtime configuration is
>> corrupted it would be overwritten by the recomputed config. Let me know if
>> this covers the scenario you have in mind or if I missed something.
>>
>> I guess I was thinking a little beyond the scope of this change, to: how
>> does the node know that the configuration hasn't been altered from what
>> it's intended to be, regardless of whether that's through a cosmic ray
>> flipping a bit, or a misbehaving file system corrupting bytes, or manual
>> modification? With the introduction of hashes, you may have a path towards
>> hardening this by enabling the node to validate the integrity of its
>> configuration at runtime, which will be more useful as Sidecar makes
>> automated deployments of configuration easier. Color me paranoid.
>>
>> But, I acknowledge this is outside the scope of this CEP.
>>
>> > * Rollback/forward - If I push a bad configuration change, how do I as
>> > the administrator respond to that?
>>
>> Change tracking is explicitly not a goal of this CEP to keep the scope
>> limited. When a bad configuration is pushed, the operator would need to
>> manually revert by submitting another PATCH request undoing the bad
>> configuration. An external RCS would need to be used to keep track of
>> config history if needed. I've added a new future work entry to support
>> change tracking natively. I've also added an Operational Guide section
>> with an overview of how this is expected to be used. Let me know if this
>> makes sense.
>>
>> Thanks so much for doing that: it's really helpful to talk about how the
>> user will work with the feature up-front.
>>
>> Not directly related: in that section it says "Sidecar does not depend on
>> Cassandra being running", but in the auth section it says "Permissions are
>> resolved from Sidecar's sidecar_internal.role_permissions_v1 tables".
>> Can Sidecar access tables on the node directly, without going through the
>> main Cassandra process?
>>
>> Thanks again -- Joel.
>>
>>
>>
>> On Tue, 17 Mar 2026 at 17:04 Joel Shepherd <[email protected]> wrote:
>>
>>> Hi Paulo - Interesting CEP, and potentially very useful: thanks!
>>>
>>> I was wondering about several things as I was reading through it:
>>>
>>> * Authorization - Particularly for operations that mutate configuration
>>> (either in the store or at run-time for the node). What are the
>>> authorization controls?
>>>
>>> * Integrity - I see you're using hashes for conflict detection. Have you
>>> considered using them as integrity checks as well: e.g., to guarantee the
>>> configuration deployed for the node/instance to load at runtime is the
>>> same configuration computed by the configuration manager (I think that's
>>> the right component)? This would be a guard against bugs, network
>>> gremlins, file system gremlins, etc., quietly corrupting the
>>> configuration that the node will eventually read.
>>>
>>> * Visibility - As an 'administrator' how do I determine how much of my
>>> cluster is running on the latest configuration, and which nodes
>>> specifically aren't? Is it up to me to implement that monitoring?
>>>
>>> * Rollback/forward - If I push a bad configuration change, how do I as
>>> the administrator respond to that? For example, is there an assumption
>>> that I'll be managing my configuration in a RCS somewhere and will be
>>> expected to quickly retrieve a known-good older revision from it and push
>>> it through sidecar? It might be helpful to have a "user experience"
>>> section in the CEP to describe how you envision users managing their
>>> cluster's configuration through this tool: what they're responsible for,
>>> what the tool is responsible for.
>>>
>>> Thanks -- Joel.
>>> On 3/17/2026 9:32 AM, Paulo Motta wrote:
>>>
>>>
>>> Hi everyone,
>>>
>>> I'd like to propose CEP-62: Cassandra Configuration Management via
>>> Sidecar for discussion by the community.
>>>
>>> CASSSIDECAR-266[1] introduced Cassandra process lifecycle management
>>> capabilities to Sidecar, giving operators the ability to start and stop
>>> Cassandra instances programmatically. However, Sidecar currently has no
>>> way to manipulate the configuration files that those instances consume at
>>> startup.
>>>
>>> Many Cassandra settings (memtable configuration, SSTable settings,
>>> storage_compatibility_mode) cannot be modified at runtime via JMX/CQL and
>>> must be set in cassandra.yaml or JVM options files, requiring a restart
>>> to take effect. Managing these files manually or through custom tooling
>>> is cumbersome and lacks a stable API.
>>>
>>> This CEP extends Sidecar's lifecycle management by adding configuration
>>> management capabilities for persisted configuration artifacts. It
>>> introduces a REST API for reading and updating cassandra.yaml and JVM
>>> options, a pluggable ConfigurationProvider abstraction for integration
>>> with centralized configuration systems (etcd, Consul, or custom
>>> backends), and version-aware validation to prevent startup failures.
>>>
>>> This CEP also serves as a prerequisite for future Cassandra upgrades via
>>> Sidecar. For example, upgrading from Cassandra 4 to Cassandra 5 requires
>>> updating storage_compatibility_mode in cassandra.yaml. The configuration
>>> management capabilities introduced here will enable Sidecar to
>>> orchestrate such upgrades by updating configuration artifacts alongside
>>> binary version changes.
>>>
>>> The CEP is linked here:
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
>>>
>>> Looking forward to your feedback!
>>>
>>> Thanks,
>>>
>>> Paulo
>>>
>>> [1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266
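
To make the drift-detection idea from this thread a bit more concrete (comparing the operator's desired config with the running config reported by the `system_views.settings` virtual table), here is a minimal sketch. It is only an illustration under stated assumptions, not what the CEP or Sidecar implements: the function names `detect_drift` and `_normalize` are hypothetical, the running config is passed in as a plain name/value dict rather than being read over CQL, and the string-based comparison rule is a guess at what a reasonable normalization might look like.

```python
from typing import Dict, Optional, Tuple


def _normalize(value: object) -> str:
    """Render a desired value as the string form it is compared in.

    Assumption: running settings arrive as strings, so YAML booleans are
    mapped to "true"/"false" and everything else is stringified and trimmed.
    """
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value).strip()


def detect_drift(
    desired: Dict[str, object],
    running: Dict[str, str],
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {setting: (desired_value, running_value)} for every mismatch.

    `running` stands in for the name/value rows of system_views.settings;
    fetching those rows is deliberately left out of this sketch. A setting
    missing from `running` is reported with a running value of None.
    """
    drift: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, want in desired.items():
        want_str = _normalize(want)
        have = running.get(name)
        if have is None or _normalize(have) != want_str:
            drift[name] = (want_str, have)
    return drift


if __name__ == "__main__":
    # Illustrative values only; not taken from a real cluster.
    desired = {"storage_compatibility_mode": "NONE", "concurrent_reads": 32}
    running = {"storage_compatibility_mode": "CASSANDRA_4", "concurrent_reads": "32"}
    for setting, (want, have) in detect_drift(desired, running).items():
        print(f"{setting}: desired={want} running={have}")
```

Only settings present in the desired config are checked, so extra settings reported by the node are ignored; whether those should also count as drift is an open design choice.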
