[ https://issues.apache.org/jira/browse/CASSANDRA-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882427#comment-17882427 ]
Michael Semb Wever commented on CASSANDRA-13704:
------------------------------------------------

I'm seeing failures in 5.0
{noformat}
failed on teardown with "Unexpected error found in node logs (see stdout for full details). Errors: [[node1] 'ERROR [MutationStage-1] 2024-09-17 13:03:31,667 JVMStabilityInspector.java:70 - Exception in thread Thread[MutationStage-1,10,SharedPool]\njava.lang.IllegalArgumentException: Conflicting replica added (expected unique endpoints): Full(/127.0.0.3:7000,(4611686018427387904,0]); existing: Full(/127.0.0.3:7000,(281474976710656,0])\n\tat org.apache.cassandra.locator.EndpointsForToken$Builder.add(EndpointsForToken.java:102)\n\tat org.apache.cassandra.locator.EndpointsForToken$Builder.add(EndpointsForToken.java:79)\n\tat org.apache.cassandra.locator.ReplicaCollection$Builder.addAll(ReplicaCollection.java:160)\n\tat org.apache.cassandra.locator.ReplicaCollection$Builder.addAll(ReplicaCollection.java:166)\n\tat org.apache.cassandra.locator.EndpointsForToken.copyOf(EndpointsForToken.java:162)\n\tat org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalAndPendingReplicasForToken(AbstractReplicationStrategy.java:126)\n\tat org.apache.cassandra.service.StorageService.isEndpointValidForWrite(StorageService.java:5162)\n\tat org.apache.cassandra.db.AbstractMutationVerbHandler.isOutOfRangeMutation(AbstractMutationVerbHandler.java:79)\n\tat org.apache.cassandra.db.AbstractMutationVerbHandler.processMessage(AbstractMutationVerbHandler.java:53)\n\tat org.apache.cassandra.db.AbstractMutationVerbHandler.doVerb(AbstractMutationVerbHandler.java:44)\n\tat org.apache.cassandra.db.ReadRepairVerbHandler.doVerb(ReadRepairVerbHandler.java:35)\n\tat org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)\n\tat org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)\n\tat org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)\n\tat org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)\n\tat org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)\n\tat org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.base/java.lang.Thread.run(Thread.java:829)']"
{noformat}

> Safer handling of out of range tokens
> -------------------------------------
>
>                 Key: CASSANDRA-13704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13704
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Coordination, Legacy/Observability
>            Reporter: Sam Tunnicliffe
>            Assignee: Caleb Rackliffe
>            Priority: Urgent
>             Fix For: 4.0.x, 4.1.x, 5.0.x
>
>         Attachments: CASSANDRA-13704_5-0_23_ci_summary.html,
> CASSANDRA-13704_5-0_23_results_details.tar.xz, ci_summary-1.html,
> ci_summary-2.html, ci_summary.html, result_details.tar-1.gz,
> result_details.tar-2.gz, result_details.tar.gz
>
>          Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> It is possible for nodes to have a divergent view of the ring, which can
> result in some operations being sent to the wrong nodes. This is an umbrella
> ticket to mitigate such issues by adding logging when a node is asked to
> perform an operation for tokens it does not own. This will be useful for
> detecting when the nodes' views of the ring diverge, which is not highly
> visible at the moment, and also for post-hoc analysis.
>
> It may also be beneficial to straight up reject certain operations, though
> this will need to balance the risk of performing those ops against the
> consequences rejecting them has on availability.
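
The log-or-reject idea in the ticket description can be sketched roughly as below. This is an illustrative sketch only, not the code from the patch: the class OutOfRangeTokenCheck, the TokenRange type, the Action enum and the rejectEnabled flag are all hypothetical names introduced here. It assumes token ranges are half-open, (left, right], and wrap around the ring when left >= right, which is how a range such as (4611686018427387904,0] in the error above would be read.

```java
import java.util.Arrays;
import java.util.List;

public class OutOfRangeTokenCheck {

    /** Hypothetical half-open token range (left, right]; wraps the ring when left >= right. */
    static final class TokenRange {
        final long left;
        final long right;

        TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            if (left < right) {
                // Ordinary range: strictly after left, up to and including right.
                return token > left && token <= right;
            }
            // Wrapping range: everything after left plus everything up to right.
            return token > left || token <= right;
        }
    }

    /** What a node could do with an operation, given the ranges it believes it owns. */
    enum Action { ACCEPT, LOG_ONLY, REJECT }

    /**
     * Accepts the operation if any owned range covers the token. Otherwise the
     * operation is either only logged or rejected, depending on rejectEnabled
     * (an assumed switch standing in for the availability trade-off).
     */
    static Action check(List<TokenRange> owned, long token, boolean rejectEnabled) {
        for (TokenRange range : owned) {
            if (range.contains(token)) {
                return Action.ACCEPT;
            }
        }
        return rejectEnabled ? Action.REJECT : Action.LOG_ONLY;
    }

    public static void main(String[] args) {
        List<TokenRange> owned = Arrays.asList(new TokenRange(4611686018427387904L, 0L));
        System.out.println(check(owned, -42L, true));  // token inside the wrapping range
        System.out.println(check(owned, 1L, false));   // token outside, logging only
        System.out.println(check(owned, 1L, true));    // token outside, rejection enabled
    }
}
```

Keeping the decision as a three-valued result lets the logging-only mode and the rejecting mode share one ownership check, so rejection can be switched on separately once the logs show that the nodes' views of the ring agree.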