[ https://issues.apache.org/jira/browse/CASSANDRA-20514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17940929#comment-17940929 ]
Michael Semb Wever edited comment on CASSANDRA-20514 at 4/4/25 10:10 AM:
-------------------------------------------------------------------------

this broke CI (`ant check` fails).

https://ci-cassandra.apache.org/job/Cassandra-5.0/430/cloudbees-pipeline-explorer/?filter=1154


was (Author: michaelsembwever):
this broke CI (`ant lint` fails).

https://ci-cassandra.apache.org/job/Cassandra-5.0/430/cloudbees-pipeline-explorer/?filter=1154


> Paxos mixed mode infinite loop with ttl'd state
> -----------------------------------------------
>
>                 Key: CASSANDRA-20514
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20514
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/Lightweight Transactions
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.x
>
>
> This is similar to the bug fixed in CASSANDRA-20493.
> CEP-14 changed the ttl behavior of legacy paxos state to expire based off the
> ballot time of the operation being persisted, not the time a commit is
> persisted. This eliminated the race addressed by CASSANDRA-12043, and so the
> check it added to the most recent commit prepare logic was removed.
> When operating in mixed mode though, this race can still be a problem. If a
> 4.1 or higher node is coordinating a paxos operation with 2 or more replicas
> on 4.0 or lower, this race becomes a problem again. You need 3 things to make
> this an infinite loop:
> 1. a 4.1 node coordinating a paxos operation with 2x 4.0 replicas
> 2. replica A) a 4.0 node returns a most recent commit (mrc) for a ballot that
> could have been ttl'd
> 3. replica B) a 4.0 node has ttl'd that mrc AND converted the ttl'd cells into
> tombstones
> The 4.1 coordinator receives the mrc from replica A, but since it no longer
> disregards missing most recent commits past the ttl window, it sends the
> "missing" commit to replica B.
> Since replica B now has a tombstone for that
> mrc, and tombstones win when reconciled with live cells, even ones with ttls,
> the commit is a noop and replica B continues to report nothing for its mrc
> value when the coordinator restarts the prepare phase. This loops until the
> query times out.
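
The loop described above can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not Cassandra's implementation: `Cell`, `LegacyReplica`, `reconcile`, `coordinate`, and the `ignore_expired_before` parameter are hypothetical names made up for illustration. It assumes the re-sent commit is written with the same ballot-derived timestamp as the expired cell, and that a tombstone wins a timestamp tie, as the description states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored most recent commit (mrc) cell."""
    ballot: int        # write timestamp, assumed to be derived from the ballot
    tombstone: bool    # True once the ttl'd cell has been converted


def reconcile(a: Optional[Cell], b: Optional[Cell]) -> Optional[Cell]:
    """Newer timestamp wins; on a tie the tombstone wins over the live cell."""
    if a is None:
        return b
    if b is None:
        return a
    if a.ballot != b.ballot:
        return a if a.ballot > b.ballot else b
    return a if a.tombstone else b


class LegacyReplica:
    """Toy model of a 4.0 replica holding a single mrc cell."""

    def __init__(self, mrc: Optional[Cell]) -> None:
        self.mrc = mrc

    def prepare(self) -> Optional[int]:
        # A tombstoned mrc is invisible, so the replica reports nothing.
        if self.mrc is None or self.mrc.tombstone:
            return None
        return self.mrc.ballot

    def commit(self, ballot: int) -> None:
        # The incoming live cell is reconciled with whatever is stored.
        self.mrc = reconcile(self.mrc, Cell(ballot, tombstone=False))


def coordinate(replicas: List[LegacyReplica], max_rounds: int,
               ignore_expired_before: Optional[int] = None) -> Optional[int]:
    """Return the round in which prepare succeeds, or None on 'timeout'.

    ignore_expired_before mimics (hypothetically) a check that disregards
    missing commits whose ballot is past the ttl window.
    """
    for round_no in range(1, max_rounds + 1):
        responses = [r.prepare() for r in replicas]
        known = [b for b in responses if b is not None]
        if not known:
            return round_no
        latest = max(known)
        lagging = [r for r, b in zip(replicas, responses) if b != latest]
        if ignore_expired_before is not None and latest < ignore_expired_before:
            lagging = []  # commit is old enough to have expired: not "missing"
        if not lagging:
            return round_no
        for r in lagging:
            r.commit(latest)  # send the "missing" commit, then restart prepare
    return None


def make_replicas() -> List[LegacyReplica]:
    replica_a = LegacyReplica(Cell(100, tombstone=False))  # still returns mrc
    replica_b = LegacyReplica(Cell(100, tombstone=True))   # ttl'd to tombstone
    return [replica_a, replica_b]
```

In this model, `coordinate(make_replicas(), max_rounds=5)` returns `None`: replica B's tombstone absorbs every re-sent commit, so prepare never converges. Passing `ignore_expired_before=200` lets the first round succeed, which illustrates why disregarding commits past the ttl window breaks the loop.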