On 3/26/2026 12:30 PM, Paulo Motta wrote:
> Is there a write-up somewhere of how Sidecar auth works end-to-end?
I don't think there is a formal doc available but you can find more
information in the PR that introduced it:
- https://issues.apache.org/jira/browse/CASSSIDECAR-161
- https://github.com/apache/cassandra-sidecar/commit/5a19e3448038fa4b2e9f497ab94dbbe911f44c29
Thanks - I'll give those a read.
> Not directly related: in that section it says "Sidecar does not
depend on Cassandra being running", but in the auth section it says
"Permissions are resolved from Sidecar's
|sidecar_internal.role_permissions_v1| tables". Can Sidecar access
tables on the node directly, without going through the main Cassandra
process?
That's a good point. When the managed node is offline it's possible to
access authorization tables through other nodes, by specifying
multiple CQL contact points in sidecar.yml. Alternatively it's
possible to set up admin_identities in sidecar.yml allowing admins to
perform config changes in the event that auth tables are unavailable.
I have updated the authorization section with a note about this.
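Roughly, the fallback could look something like this in sidecar.yml (the
key names and identity format below are illustrative guesses on my part,
not the actual schema -- see the Sidecar docs for the real layout):

```yaml
# Hypothetical sidecar.yml fragment -- key names are illustrative only.
cassandra:
  contact_points:          # additional CQL endpoints to fall back to
    - 10.0.0.1:9042        # when the locally managed node is offline
    - 10.0.0.2:9042

access_control:
  admin_identities:        # identities allowed to make config changes
    - spiffe://example.org/admin   # even if the auth tables are unreachable
```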
Thanks for doing that.
One thing to be intentional about avoiding is a dependency loop, where
a cluster can't start without Sidecar and Sidecar can't start without
the cluster. It's
probably more important to avoid dependencies in Cassandra that would
prevent a cluster from cold-starting. Imagine a rack or datacenter
losing power and a cluster having to come back online as quickly as
possible. Ideally, it should be more or less automatic as the servers
reboot but if a circular dependency is introduced, recovery from that
kind of outage could be difficult.
Let me know if you have any remaining questions or comments before we
bring this to a vote.
None of my comments above should be in the way of that.
Thanks -- Joel.
On Wed, Mar 25, 2026 at 6:16 PM Joel Shepherd <[email protected]> wrote:
Thanks, Paulo - I think I'm still trying to form a mental model of
where Sidecar's responsibilities start and end. It sounds like for
this proposal, its scope is basically applying the "current"
configuration to the node(s), but the configuration management
itself needs to be done above Sidecar. Makes sense.
A couple questions below for my own learning, and a response as
well...
On 3/18/2026 3:14 PM, Paulo Motta wrote:
Thanks for the feedback Joel! See follow-up below:
> * Authorization - What are the authorization controls?
Good call! This will use Sidecar's existing authorization
mechanism per HTTP endpoint. Two new permissions will be added:
CONFIGURATION:READ and CONFIGURATION:MODIFY to control reading or
updating configs. I've updated the doc with a new section about
authorization.
Is there a write-up somewhere of how Sidecar auth works end-to-end?
> * Integrity - I see you're using hashes for conflict detection.
Have you considered using them as integrity checks as well: e.g., to
guarantee the configuration deployed for the node/instance to load at
runtime is the same configuration computed by the configuration
manager (I think that's the right component)?
The deployed configuration is guaranteed to be the same
configuration computed by the configuration manager, since the
runtime configuration is always refreshed during instance
startup. If the runtime configuration is corrupted it would be
overwritten by the recomputed config. Let me know if this covers the
scenario you have in mind or if I missed something.
I guess I was thinking a little beyond the scope of this change,
to: how does the node know that the configuration hasn't been
altered from what it's intended to be, regardless of whether
that's through a cosmic ray flipping a bit, or a misbehaving file
system corrupting bytes, or manual modification? With the
introduction of hashes, you may have a path towards hardening this
by enabling the node to validate the integrity of its
configuration at runtime, which will be more useful as Sidecar
makes automated deployments of configuration easier. Color me
paranoid.
But, I acknowledge this is outside the scope of this CEP.
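To make the paranoia concrete, the kind of runtime check I have in mind
is just a digest comparison at load time. A minimal sketch in Python
(the file name and the idea of a separately stored expected hash are my
assumptions, not something the CEP specifies):

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_config(config_path: Path, expected_hash: str) -> bool:
    """Compare the on-disk config against the hash the config manager
    computed at deployment time.

    A mismatch means the file changed after deployment (bit flip,
    filesystem corruption, manual edit) and shouldn't be loaded blindly.
    """
    return file_sha256(config_path) == expected_hash
```

Nothing fancy, but since the hashes already exist for conflict
detection, the marginal cost of also checking them before startup seems
small.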
> * Rollback/forward - If I push a bad configuration change, how do I
as the administrator respond to that?
Change tracking is explicitly not a goal of this CEP to keep the
scope limited. When a bad configuration is pushed, the operator
would need to manually revert by submitting another PATCH request
undoing the bad configuration. An external RCS would need to be
used to keep track of config history if needed. I've added a new
future work entry to support change tracking natively. I've also
added an Operational Guide section with an overview of how this
is expected to be used. Let me know if this makes sense.
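As a concrete illustration of the manual revert flow (the endpoint path
and the use of an If-Match header for the conflict-detection hash are
entirely my guesses here, not the API contract -- check the CEP for the
real shape):

```python
# Hypothetical sketch of assembling a revert request. The path
# "/api/v1/cassandra/config" and the If-Match precondition are
# illustrative assumptions, not the actual Sidecar API.

def build_revert_patch(known_good: dict, current_hash: str) -> dict:
    """Assemble a PATCH request that restores a known-good config,
    conditioned on the config not having changed since we read it."""
    return {
        "method": "PATCH",
        "path": "/api/v1/cassandra/config",
        "headers": {
            "Content-Type": "application/json",
            "If-Match": current_hash,  # fail fast on a concurrent change
        },
        "body": known_good,
    }
```

The point being that the same hash used for conflict detection lets a
revert fail fast instead of clobbering a concurrent change.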
Thanks so much for doing that: it's really helpful to talk about
how the user will work with the feature up-front.
Not directly related: in that section it says "Sidecar does not
depend on Cassandra being running", but in the auth section it
says "Permissions are resolved from Sidecar's
|sidecar_internal.role_permissions_v1| tables". Can Sidecar access
tables on the node directly, without going through the main
Cassandra process?
Thanks again -- Joel.
On Tue, 17 Mar 2026 at 17:04 Joel Shepherd <[email protected]>
wrote:
Hi Paulo - Interesting CEP, and potentially very useful: thanks!
I was wondering about several things as I was reading through it:
* Authorization - Particularly for operations that mutate
configuration (either in the store or at run-time for the
node). What are the authorization controls?
* Integrity - I see you're using hashes for conflict
detection. Have you considered using them as integrity checks
as well: e.g., to guarantee the configuration deployed for
the node/instance to load at runtime is the same
configuration computed by the configuration manager (I think
that's the right component)? This would be a guard against
bugs, network gremlins, file system gremlins, etc., quietly
corrupting the configuration that the node will eventually read.
* Visibility - As an 'administrator' how do I determine how
much of my cluster is running on the latest configuration,
and which nodes specifically aren't? Is it up to me to
implement that monitoring?
* Rollback/forward - If I push a bad configuration change,
how do I as the administrator respond to that? For
example, is there an assumption that I'll be managing my
configuration in an RCS somewhere and will be expected to
quickly retrieve a known-good older revision from it and push
it through sidecar? It might be helpful to have a "user
experience" section in the CEP to describe how you envision
users managing their cluster's configuration through this
tool: what they're responsible for, what the tool is
responsible for.
Thanks -- Joel.
On 3/17/2026 9:32 AM, Paulo Motta wrote:
Hi everyone,
I'd like to propose CEP-62: Cassandra Configuration
Management via Sidecar for discussion by the community.
CASSSIDECAR-266[1] introduced Cassandra process lifecycle
management capabilities to Sidecar, giving operators the
ability to start and stop Cassandra instances
programmatically. However, Sidecar currently has no way to
manipulate the configuration files that those instances
consume at startup.
Many Cassandra settings (memtable configuration, SSTable
settings, storage_compatibility_mode) cannot be modified at
runtime via JMX/CQL and must be set in cassandra.yaml or JVM
options files, requiring a restart to take effect. Managing
these files manually or through custom tooling is cumbersome
and lacks a stable API.
This CEP extends Sidecar's lifecycle management by adding
configuration management capabilities for persisted
configuration artifacts. It introduces a REST API for
reading and updating cassandra.yaml and JVM options, a
pluggable ConfigurationProvider abstraction for integration
with centralized configuration systems (etcd, Consul, or
custom backends), and version-aware validation to prevent
startup failures.
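For a rough feel of the provider idea (the real interface will live in
Sidecar's Java codebase; the method names below are illustrative only,
sketched here in Python):

```python
from abc import ABC, abstractmethod
import hashlib

class ConfigurationProvider(ABC):
    """Sketch of the pluggable-backend idea: a store (etcd, Consul, a
    custom system) that hands Sidecar the desired config artifacts.
    Method names are illustrative, not from the CEP."""

    @abstractmethod
    def fetch(self, artifact: str) -> bytes:
        """Return the desired contents of e.g. 'cassandra.yaml'."""

    @abstractmethod
    def version(self, artifact: str) -> str:
        """An opaque version/hash for conflict and drift detection."""

class InMemoryProvider(ConfigurationProvider):
    """Trivial backend, useful for tests."""

    def __init__(self, artifacts: dict[str, bytes]):
        self._artifacts = artifacts

    def fetch(self, artifact: str) -> bytes:
        return self._artifacts[artifact]

    def version(self, artifact: str) -> str:
        return hashlib.sha256(self._artifacts[artifact]).hexdigest()
```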
This CEP also serves as a prerequisite for future Cassandra
upgrades via Sidecar. For example, upgrading from Cassandra
4 to Cassandra 5 requires updating
storage_compatibility_mode in cassandra.yaml. The
configuration management capabilities introduced here will
enable Sidecar to orchestrate such upgrades by updating
configuration artifacts alongside binary version changes.
The CEP is linked here:
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-62%3A+Cassandra+Configuration+Management+via+Sidecar
Looking forward to your feedback!
Thanks,
Paulo
[1] - https://issues.apache.org/jira/browse/CASSSIDECAR-266