Agreed. We were planning the same change and tested multiple scenarios; our conclusion was that it needs downtime to be on the safe side. With the right automation in place, the implementation can be made faster, but not without downtime, at least in our case.
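A minimal sketch of one piece of such automation, assuming the setting is changed by rewriting the `internode_encryption` line under `server_encryption_options` in each node's cassandra.yaml during the outage window. The helper name `set_internode_encryption` is hypothetical, and stopping and starting the nodes, as well as distributing keystores and truststores, is left out:

```python
import re

# Values the original question lists as valid for Cassandra 3.11.
VALID_MODES = {"none", "all", "dc", "rack"}

# Matches an indented, uncommented "internode_encryption: <value>" line.
_SETTING = re.compile(r"^([ \t]+internode_encryption:[ \t]*)\S+", re.MULTILINE)


def set_internode_encryption(yaml_text: str, mode: str) -> str:
    """Return yaml_text with the internode_encryption value replaced by mode.

    Hypothetical helper: plain text substitution, so comments and the rest
    of the file are left untouched. Raises ValueError if the mode is unknown
    or the setting does not appear exactly once.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown internode_encryption mode: {mode!r}")
    new_text, count = _SETTING.subn(lambda m: m.group(1) + mode, yaml_text)
    if count != 1:
        raise ValueError(f"expected exactly one internode_encryption line, found {count}")
    return new_text


if __name__ == "__main__":
    sample = (
        "server_encryption_options:\n"
        "    internode_encryption: none\n"
        "    keystore: conf/.keystore\n"
    )
    print(set_internode_encryption(sample, "all"))
```

Failing loudly when the line is missing or duplicated is deliberate: with every node down, a silently skipped file would bring one node back up with the old setting.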
On Mon, Jul 6, 2020, 1:26 PM Durity, Sean R <sean_r_dur...@homedepot.com> wrote:

> I plan downtime for changes to security settings like this. I could not
> come up with a way to not have degraded access or inconsistent data or
> something else bad. The foundational issue is that unencrypted nodes cannot
> communicate with encrypted ones.
>
> I depend on Cassandra’s high availability for many things, but I always
> caution my teams that security-related changes will usually require an
> outage. When I can have an outage window, this kind of change is very quick.
>
> Sean Durity
>
> *From:* Egan Neuhengen <egan.neuhen...@iovation.com>
> *Sent:* Monday, July 6, 2020 12:50 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Safely enabling internode TLS encryption on live
> cassandra cluster
>
> Hello,
>
> We are trying to come up with a safe way to turn on internode (NOT
> client-server) TLS encryption on a cassandra cluster with two datacenters,
> anywhere from 3 to 20 nodes in each DC, 3+ racks in each DC. Cassandra
> version is 3.11.6, OS is CentOS 7. We have full control over cassandra
> configuration and operation, and a decent amount of control over client
> driver configuration. We're looking for a way to enable internode TLS with
> no period of time in which clients cannot connect to the cluster or clients
> can connect but receive inconsistent or incorrect data results.
>
> Our understanding is that in 3.11, cassandra internode TLS encryption
> configuration (server_encryption_options::internode_encryption) can be set
> to none, all, dc, or rack, and "none" means the node will only send and
> receive unencrypted data, any other involves varying scope of only sending
> and receiving encrypted data; an "optional" setting only appears in the
> unreleased 4.0. The problem we run into is that no matter which scope we
> use, we end up with a period of time in which two different parts of the
> cluster won't be able to talk to each other, and so clients might get
> different answers depending on which part they talk to. In this scenario,
> clients can be shifted to talk to only one DC for a limited time, but
> cannot transition directly from only communicating with one DC to only
> communicating to the other; some period of time must be spent communicating
> to both, however small, between those two states.
>
> Is there a way to do this while avoiding downtime and wrong-answer
> problems?
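The split described in the original question can be illustrated with a toy model. It encodes only the understanding stated in that message, as an assumption: a node set to "none" uses plaintext on every link, any other setting uses TLS on the links within its scope, and a link works only when both ends agree. The `Node` class and the `wants_tls`, `can_talk`, and `broken_links` functions are hypothetical names for this sketch, not Cassandra APIs, and real node behaviour should be confirmed on a test cluster:

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Node:
    name: str
    dc: str
    rack: str
    internode_encryption: str  # "none", "all", "dc", or "rack"


def wants_tls(sender: Node, receiver: Node) -> bool:
    """Toy rule: does sender use TLS on its link to receiver?"""
    mode = sender.internode_encryption
    if mode == "none":
        return False
    if mode == "all":
        return True
    if mode == "dc":
        # Encrypt only traffic that leaves the sender's datacenter.
        return sender.dc != receiver.dc
    if mode == "rack":
        # Encrypt only traffic that leaves the sender's rack.
        return sender.dc != receiver.dc or sender.rack != receiver.rack
    raise ValueError(f"unknown internode_encryption: {mode!r}")


def can_talk(a: Node, b: Node) -> bool:
    """Assumption from the thread: a link works only if both ends agree on TLS."""
    return wants_tls(a, b) == wants_tls(b, a)


def broken_links(nodes):
    """Return the sorted list of node-name pairs that cannot communicate."""
    return sorted(
        (a.name, b.name) for a, b in combinations(nodes, 2) if not can_talk(a, b)
    )


if __name__ == "__main__":
    # Mid-rollout snapshot: dc1 already switched to "all", dc2 still "none".
    mid_rollout = [
        Node("a1", "dc1", "r1", "all"),
        Node("a2", "dc1", "r2", "all"),
        Node("b1", "dc2", "r1", "none"),
        Node("b2", "dc2", "r2", "none"),
    ]
    for pair in broken_links(mid_rollout):
        print("cannot talk:", pair)
```

Under this model, switching one datacenter at a time keeps links inside each datacenter working but cuts every cross-datacenter link until the second datacenter is switched, which is the window the question is trying to avoid.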