[ https://issues.apache.org/jira/browse/CASSANDRA-21095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18049911#comment-18049911 ]
Ariel Weisberg commented on CASSANDRA-21095:
--------------------------------------------
What is the version of Cassandra where this was observed?
> Paxos V2 emits errors after node decommission
> ---------------------------------------------
>
> Key: CASSANDRA-21095
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21095
> Project: Apache Cassandra
> Issue Type: Bug
> Reporter: Paulo Henrique Abreu
> Priority: Normal
>
> During a controlled node decommission in a Cassandra cluster with Paxos V2
> enabled, the cluster continued to emit Paxos V2–related errors after the node
> had been fully removed from the ring. The errors indicate that Paxos V2
> attempted to reconcile or finalize Paxos state still referencing the
> decommissioned node, even though it was no longer part of the topology.
> No immediate data loss was observed, but the behavior caused persistent Paxos
> errors in the logs and represents an operational risk for workloads relying on
> LWTs, potentially leading to retries, increased latency, or instability.
> As a workaround, Paxos was downgraded from Paxos V2 to Paxos V1 at the
> cluster level. With Paxos V1 enabled, the required node decommission
> operations completed successfully without Paxos-related errors. After the
> topology changes were finalized and the cluster stabilized, Paxos was
> switched back to Paxos V2.
>
> {{Error:}}
> {{WARN [Messaging-EventLoop-3-7] 2025-12-07 03:40:17,548 OutboundConnection.java:491 - /10.10.12.144:7000->/10.10.12.144:7000-SMALL_MESSAGES-61fb9973 dropping message of type PAXOS2_CLEANUP_START_PREPARE_REQ due to error}}
> {{org.apache.cassandra.net.InvalidSerializedSizeException: Invalid serialized size; expected 5312, actual size at least 5311, for verb PAXOS2_CLEANUP_START_PREPARE_REQ}}
> {{ at org.apache.cassandra.net.OutboundConnection$EventLoopDelivery.doRun(OutboundConnection.java:819)}}
> {{ at org.apache.cassandra.net.OutboundConnection$Delivery.run(OutboundConnection.java:690)}}
> {{ at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)}}
> {{ at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)}}
> {{ at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)}}
> {{ at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)}}
> {{ at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)}}
> {{ at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}}
> {{ at java.lang.Thread.run(Thread.java:748)}}