RE: Datacenter decommissioning on Cassandra 4.1.4

Michalis Kotsiouros (EXT) via user Mon, 08 Apr 2024 07:26:20 -0700

Hello Jon and Jeff,
Thanks a lot for your replies.
I completely get your points.
Some more clarification about my issue.
When trying to update the Replication before the decommission, I get the 
following error message when I remove the replication for system_auth kesypace.
ConfigurationException: Following datacenters have active nodes and must be 
present in replication options for keyspace system_auth: [datacenter1]


This error message does not appear in the rest of the application keyspaces.
So, may I change the procedure to:

  1.  Make sure no clients are still writing to any nodes in the datacenter.
  2.  Run a full repair with nodetool repair.
  3.  Change all keyspaces so they no longer reference the datacenter being 
removed apart from system_auth keyspace.
  4.  Run nodetool decommission using the --force option on every node in the 
datacenter being removed.
  5.  Change system_auth keyspace so they no longer reference the datacenter 
being removed.
BR
MK



From: Jeff Jirsa <jji...@gmail.com>
Sent: April 08, 2024 17:19
To: cassandra <user@cassandra.apache.org>
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

To Jon’s point, if you remove from replication after step 1 or step 2 (probably 
step 2 if your goal is to be strictly correct), the nodetool decommission phase 
becomes almost a no-op.

If you use the order below, the last nodes to decommission will cause those 
surviving machines to run out of space (assuming you have more than a few nodes 
to start)




On Apr 8, 2024, at 6:58 AM, Jon Haddad 
<j...@jonhaddad.com<mailto:j...@jonhaddad.com>> wrote:

You shouldn’t decom an entire DC before removing it from replication.

—

Jon Haddad
Rustyrazorblade Consulting
rustyrazorblade.com<https://protect2.fireeye.com/v1/url?k=31323334-501d5122-313273af-454445555731-1624a77accb6d839&q=1&e=8a954d2d-17da-40df-8732-bdcc7893179a&u=http%3A%2F%2Frustyrazorblade.com%2F>


On Mon, Apr 8, 2024 at 6:26 AM Michalis Kotsiouros (EXT) via user 
<user@cassandra.apache.org<mailto:user@cassandra.apache.org>> wrote:
Hello community,
In our deployments, we usually rebuild the Cassandra datacenters for 
maintenance or recovery operations.
The procedure used since the days of Cassandra 3.x was the one documented in 
datastax documentation. Decommissioning a datacenter | Apache Cassandra 3.x 
(datastax.com)<https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsDecomissionDC.html>
After upgrading to Cassandra 4.1.4, we have realized that there are some 
stricter rules that do not allo to remove the replication when active Cassandra 
nodes still exist in a datacenter.
This check makes the above-mentioned procedure obsolete.
I am thinking to use the following as an alternative:

  1.  Make sure no clients are still writing to any nodes in the datacenter.
  2.  Run a full repair with nodetool repair.
  3.  Run nodetool decommission using the --force option on every node in the 
datacenter being removed.
  4.  Change all keyspaces so they no longer reference the datacenter being 
removed.

What is the procedure followed by other users? Do you see any risk following 
the proposed procedure?

BR
MK

RE: Datacenter decommissioning on Cassandra 4.1.4

Reply via email to