[jira] [Updated] (CASSANDRA-21010) CEP-46: Witness enable/disable implementation, testing, documentation

Ariel Weisberg (Jira) Mon, 10 Nov 2025 13:56:07 -0800


     [ https://issues.apache.org/jira/browse/CASSANDRA-21010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Ariel Weisberg updated CASSANDRA-21010:
---------------------------------------
    Description: 
The original process for enabling/disabling witnesses was that enabling them 
entailed changing the schema and then running nodetool cleanup. I believe this 
is still the case.

The original process for disabling witnesses is to reduce the number of 
witnesses to 0 and then increase the number of full replicas. With, say, 6 
replicas total and only 4 full replicas, this means f will be 1 in terms of 
availability. In terms of write durability, since quorum will be 3/4, you can 
still lose 2 nodes without losing any writes. This may be undesirable, but it 
is tolerable. If you ran RF=5 (or had only 3 full replicas), write durability 
would only allow for 1 lost replica, which is more problematic. You would also 
immediately lose availability with only 3 full replicas and 3 witnesses when 
you attempted to disable the witnesses.
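
To make the arithmetic above concrete, here is a minimal sketch. It is not code 
from Cassandra: the class and method names are hypothetical, and it assumes a 
simple majority quorum computed over full replicas only, i.e. the state after 
the witnesses have been reduced to 0.

```java
public class WitnessQuorumMath {
    // Majority quorum over the full replicas: floor(n / 2) + 1.
    // Assumption: witnesses are already at 0, so only full replicas count.
    static int quorum(int fullReplicas) {
        return fullReplicas / 2 + 1;
    }

    // f in terms of availability: how many full replicas can be down
    // while a quorum is still reachable.
    static int availabilityTolerance(int fullReplicas) {
        return fullReplicas - quorum(fullReplicas);
    }

    // Write durability: how many nodes can be lost after a quorum write
    // while at least one copy of the write survives.
    static int durabilityTolerance(int fullReplicas) {
        return quorum(fullReplicas) - 1;
    }

    public static void main(String[] args) {
        // The two cases discussed above: 4 full replicas and 3 full replicas.
        for (int full : new int[] {4, 3}) {
            System.out.printf("full=%d quorum=%d f=%d durable_losses=%d%n",
                              full,
                              quorum(full),
                              availabilityTolerance(full),
                              durabilityTolerance(full));
        }
    }
}
```

Under those assumptions, 4 full replicas give quorum 3, f = 1 and 2 tolerable 
losses for durability, while 3 full replicas give quorum 2, f = 1 and only 1 
tolerable loss, which matches the numbers in the paragraph above.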

The code that documents this is in {{AlterKeyspaceStatement}}:
{code:java}
       //This is true right now because the transition from transient -> full lacks the pending state
       //necessary for correctness. What would happen if we allowed this is that we would attempt
       //to read from a transient replica as if it were a full replica.
       if (oldFull > newFull && oldTrans > 0)
           throw new ConfigurationException("Can't add full replicas if there are any transient replicas. You must first remove all transient replicas, then change the # of full replicas, then add back the transient replicas");
{code}


It would be better to have a pending state for witnesses and then add them 
back to the read data placement when repair completes.

  was:
The original process for enabling/disabling witnesses was that enabling them 
entailed changing the schema and then running nodetool cleanup. I believe this 
is still the case.

The original process for disabling witnesses is to reduce the number of 
witnesses to 0 and then increase the number of full replicas. With, say, 6 
replicas total and only 4 full replicas, this means f will be 1 in terms of 
availability. In terms of write durability, since quorum will be 3/4, you can 
still lose 2 nodes without losing any writes. This may be undesirable, but it 
is tolerable. If you ran RF=5 (or had only 3 full replicas), write durability 
would only allow for 1 lost replica, which is more problematic. You would also 
immediately lose availability with only 3 full replicas and 3 witnesses when 
you attempted to disable the witnesses.

The code that documents this is in {{AlterKeyspaceStatement}}:
{noformat}
       //This is true right now because the transition from transient -> full lacks the pending state
       //necessary for correctness. What would happen if we allowed this is that we would attempt
       //to read from a transient replica as if it were a full replica.
       if (oldFull > newFull && oldTrans > 0)
           throw new ConfigurationException("Can't add full replicas if there are any transient replicas. You must first remove all transient replicas, then change the # of full replicas, then add back the transient replicas");
{noformat}


It would be better to have a pending state for witnesses and then add them 
back to the read data placement when repair completes.


> CEP-46: Witness enable/disable implementation, testing, documentation
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-21010
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21010
>             Project: Apache Cassandra
>          Issue Type: Sub-task
>            Reporter: Ariel Weisberg
>            Priority: Normal
>
> The original process for enabling/disabling witnesses was that enabling them 
> entailed changing the schema and then running nodetool cleanup. I believe 
> this is still the case.
> The original process for disabling witnesses is to reduce the number of 
> witnesses to 0 and then increase the number of full replicas. With, say, 6 
> replicas total and only 4 full replicas, this means f will be 1 in terms of 
> availability. In terms of write durability, since quorum will be 3/4, you 
> can still lose 2 nodes without losing any writes. This may be undesirable, 
> but it is tolerable. If you ran RF=5 (or had only 3 full replicas), write 
> durability would only allow for 1 lost replica, which is more problematic. 
> You would also immediately lose availability with only 3 full replicas and 
> 3 witnesses when you attempted to disable the witnesses.
> The code that documents this is in {{AlterKeyspaceStatement}}:
> {code:java}
>        //This is true right now because the transition from transient -> full lacks the pending state
>        //necessary for correctness. What would happen if we allowed this is that we would attempt
>        //to read from a transient replica as if it were a full replica.
>        if (oldFull > newFull && oldTrans > 0)
>            throw new ConfigurationException("Can't add full replicas if there are any transient replicas. You must first remove all transient replicas, then change the # of full replicas, then add back the transient replicas");
> {code}
> It would be better to have a pending state for witnesses and then add them 
> back to the read data placement when repair completes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

