Hello Jon and Jeff,

Thanks a lot for your replies. I completely get your points.

Some more clarification about my issue: when I try to update the replication before the decommission, I get the following error message when I remove the replication for the system_auth keyspace.

ConfigurationException: Following datacenters have active nodes and must be present in replication options for keyspace system_auth: [datacenter1]

This error message does not appear for the rest of the application keyspaces.
So, may I change the procedure to:

1. Make sure no clients are still writing to any nodes in the datacenter.
2. Run a full repair with nodetool repair.
3. Change all keyspaces, apart from the system_auth keyspace, so they no longer reference the datacenter being removed.
4. Run nodetool decommission using the --force option on every node in the datacenter being removed.
5. Change the system_auth keyspace so it no longer references the datacenter being removed.

BR
MK

From: Jeff Jirsa <jji...@gmail.com>
Sent: April 08, 2024 17:19
To: cassandra <user@cassandra.apache.org>
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

To Jon’s point, if you remove from replication after step 1 or step 2 (probably step 2 if your goal is to be strictly correct), the nodetool decommission phase becomes almost a no-op.

If you use the order below, the last nodes to decommission will cause those surviving machines to run out of space (assuming you have more than a few nodes to start).

On Apr 8, 2024, at 6:58 AM, Jon Haddad <j...@jonhaddad.com> wrote:

You shouldn’t decom an entire DC before removing it from replication.

--
Jon Haddad
Rustyrazorblade Consulting
rustyrazorblade.com

On Mon, Apr 8, 2024 at 6:26 AM Michalis Kotsiouros (EXT) via user <user@cassandra.apache.org> wrote:

Hello community,

In our deployments, we usually rebuild the Cassandra datacenters for maintenance or recovery operations. The procedure used since the days of Cassandra 3.x was the one documented in the DataStax documentation.

Decommissioning a datacenter | Apache Cassandra 3.x (datastax.com) <https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsDecomissionDC.html>

After upgrading to Cassandra 4.1.4, we have realized that there are some stricter rules that do not allow removing the replication while active Cassandra nodes still exist in a datacenter. This check makes the above-mentioned procedure obsolete.
I am thinking of using the following as an alternative:

1. Make sure no clients are still writing to any nodes in the datacenter.
2. Run a full repair with nodetool repair.
3. Run nodetool decommission using the --force option on every node in the datacenter being removed.
4. Change all keyspaces so they no longer reference the datacenter being removed.

What is the procedure followed by other users? Do you see any risk in following the proposed procedure?

BR
MK
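
For readers who want to see the replication changes discussed in this thread written out, here is a minimal Python sketch. It only builds the CQL ALTER KEYSPACE statements; it does not connect to a cluster or run nodetool. It splits the statements into those for the application keyspaces and the one for system_auth, which is the keyspace that raised the ConfigurationException. The keyspace name app_ks, the second datacenter name datacenter2, the replication factors, and both function names are assumptions made up for illustration, and the sketch assumes every keyspace uses NetworkTopologyStrategy. It shows the ordering asked about in the thread, not a confirmed or recommended procedure.

```python
SYSTEM_AUTH = "system_auth"


def alter_without_dc(keyspace, replication, dc):
    """Build an ALTER KEYSPACE statement that drops `dc` from the replication map.

    `replication` maps datacenter name -> replication factor
    (NetworkTopologyStrategy is assumed).
    """
    remaining = {name: rf for name, rf in replication.items() if name != dc}
    if not remaining:
        # Refuse to generate a statement that leaves the keyspace with no replicas.
        raise ValueError(f"{keyspace} would be left with no replicas")
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += [f"'{name}': {rf}" for name, rf in sorted(remaining.items())]
    return f"ALTER KEYSPACE {keyspace} WITH replication = {{" + ", ".join(options) + "};"


def split_statements(keyspaces, dc):
    """Return (application_statements, system_auth_statements) for removing `dc`."""
    app, auth = [], []
    for keyspace, replication in sorted(keyspaces.items()):
        if dc not in replication:
            continue  # this keyspace does not reference the datacenter
        target = auth if keyspace == SYSTEM_AUTH else app
        target.append(alter_without_dc(keyspace, replication, dc))
    return app, auth


if __name__ == "__main__":
    # Hypothetical cluster layout, for illustration only.
    keyspaces = {
        "app_ks": {"datacenter1": 3, "datacenter2": 3},
        "system_auth": {"datacenter1": 3, "datacenter2": 3},
    }
    app, auth = split_statements(keyspaces, "datacenter1")
    print("Application keyspaces:")
    print("\n".join(app))
    print("system_auth (rejected in 4.1.4 while datacenter1 still has active nodes):")
    print("\n".join(auth))
```

Keeping the system_auth statement in its own list makes it easy to run it at whichever point the cluster accepts it, for example once the datacenter no longer has active nodes.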