[ 
https://issues.apache.org/jira/browse/CASSANDRA-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882427#comment-17882427
 ] 

Michael Semb Wever edited comment on CASSANDRA-13704 at 9/17/24 2:42 PM:
-------------------------------------------------------------------------

I'm seeing failures in 5.0
{noformat}
failed on teardown with "Unexpected error found in node logs (see stdout for full details). Errors: [[node1] 'ERROR [MutationStage-1] 2024-09-17 13:03:31,667 JVMStabilityInspector.java:70 - Exception in thread Thread[MutationStage-1,10,SharedPool]
java.lang.IllegalArgumentException: Conflicting replica added (expected unique endpoints): Full(/127.0.0.3:7000,(4611686018427387904,0]); existing: Full(/127.0.0.3:7000,(281474976710656,0])
at org.apache.cassandra.locator.EndpointsForToken$Builder.add(EndpointsForToken.java:102)
at org.apache.cassandra.locator.EndpointsForToken$Builder.add(EndpointsForToken.java:79)
at org.apache.cassandra.locator.ReplicaCollection$Builder.addAll(ReplicaCollection.java:160)
at org.apache.cassandra.locator.ReplicaCollection$Builder.addAll(ReplicaCollection.java:166)
at org.apache.cassandra.locator.EndpointsForToken.copyOf(EndpointsForToken.java:162)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalAndPendingReplicasForToken(AbstractReplicationStrategy.java:126)
at org.apache.cassandra.service.StorageService.isEndpointValidForWrite(StorageService.java:5162)
at org.apache.cassandra.db.AbstractMutationVerbHandler.isOutOfRangeMutation(AbstractMutationVerbHandler.java:79)
at org.apache.cassandra.db.AbstractMutationVerbHandler.processMessage(AbstractMutationVerbHandler.java:53)
at org.apache.cassandra.db.AbstractMutationVerbHandler.doVerb(AbstractMutationVerbHandler.java:44)
at org.apache.cassandra.db.ReadRepairVerbHandler.doVerb(ReadRepairVerbHandler.java:35)
at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)']"
{noformat}
in
 - dtest-novnode jdk11 57/64 / dtest-novnode.consistent_bootstrap_test.TestBootstrapConsistency.test_consistent_reads_after_move
 - dtest-latest jdk11 43/64 / dtest-latest.materialized_views_test.TestMaterializedViews.test_insert_during_range_movement_rf3
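
For readers unfamiliar with the exception in the trace, here is a minimal sketch of the kind of uniqueness check that can produce such a message: a builder that refuses a second replica for an endpoint it already holds. It is a simplified model, not the actual EndpointsForToken code; the class names UniqueEndpointsSketch, Replica and Builder and the toString format are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model; NOT Cassandra's EndpointsForToken implementation.
final class UniqueEndpointsSketch {

    /** A full replica: an endpoint plus the (left, right] token range it replicates. */
    static final class Replica {
        final String endpoint;
        final long left;
        final long right;

        Replica(String endpoint, long left, long right) {
            this.endpoint = endpoint;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "Full(" + endpoint + ",(" + left + "," + right + "])";
        }
    }

    /** Collects replicas, allowing at most one replica per endpoint. */
    static final class Builder {
        private final Map<String, Replica> byEndpoint = new LinkedHashMap<>();

        Builder add(Replica replica) {
            // putIfAbsent keeps the first replica and returns it if the endpoint is already present.
            Replica existing = byEndpoint.putIfAbsent(replica.endpoint, replica);
            if (existing != null) {
                throw new IllegalArgumentException(
                    "Conflicting replica added (expected unique endpoints): "
                    + replica + "; existing: " + existing);
            }
            return this;
        }

        int size() {
            return byEndpoint.size();
        }
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        builder.add(new Replica("/127.0.0.3:7000", 281474976710656L, 0L));
        try {
            // Same endpoint, different range: rejected by the builder.
            builder.add(new Replica("/127.0.0.3:7000", 4611686018427387904L, 0L));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the conflict arises when the same endpoint shows up twice with different ranges, which matches the shape of the two Full(...) replicas in the message above. Whether that is the actual cause of these test failures is not established here.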



> Safer handling of out of range tokens
> -------------------------------------
>
>                 Key: CASSANDRA-13704
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13704
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Coordination, Legacy/Observability
>            Reporter: Sam Tunnicliffe
>            Assignee: Caleb Rackliffe
>            Priority: Urgent
>             Fix For: 4.0.x, 4.1.x, 5.0.x
>
>         Attachments: CASSANDRA-13704_5-0_23_ci_summary.html, 
> CASSANDRA-13704_5-0_23_results_details.tar.xz, 
> CASSANDRA-13704_5-0_24_ci_summary.html, 
> CASSANDRA-13704_5-0_24_results_details.tar.xz, ci_summary-1.html, 
> ci_summary-2.html, ci_summary.html, result_details.tar-1.gz, 
> result_details.tar-2.gz, result_details.tar.gz
>
>          Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> It is possible for nodes to have a divergent view of the ring, which can 
> result in some operations being sent to the wrong nodes. This is an umbrella 
> ticket to mitigate such issues by adding logging when a node is asked to 
> perform an operation for tokens it does not own. This will be useful for 
> detecting when the nodes' views of the ring diverge, which is not highly 
> visible at the moment, and also for post-hoc analysis.
> It may also be beneficial to straight up reject certain operations, though 
> this will need to balance the risk of performing those ops against the 
> consequences rejecting them has on availability.
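
The idea in the description above (log when a node is asked to act on a token it does not own, and optionally reject) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch attached to the ticket: the names OutOfRangeTokenSketch, Range, Decision and check are hypothetical, tokens are modelled as plain longs, and ranges are treated as (left, right] with wrap-around when left >= right.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical sketch of an ownership check; not the actual CASSANDRA-13704 change.
final class OutOfRangeTokenSketch {
    private static final Logger LOG = Logger.getLogger("OutOfRangeTokenSketch");

    enum Decision { ACCEPT, LOG_ONLY, REJECT }

    /** A token range (left, right]; it wraps around the ring when left >= right. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            if (left < right) {
                return token > left && token <= right;
            }
            // Wrapping range: covers everything after left plus everything up to right.
            return token > left || token <= right;
        }
    }

    /**
     * Accepts the operation if any owned range contains the token. Otherwise it
     * always logs, and rejects only when rejectOutOfRange is set.
     */
    static Decision check(long token, List<Range> owned, boolean rejectOutOfRange) {
        for (Range range : owned) {
            if (range.contains(token)) {
                return Decision.ACCEPT;
            }
        }
        LOG.warning("Received operation for token " + token + " outside owned ranges");
        return rejectOutOfRange ? Decision.REJECT : Decision.LOG_ONLY;
    }

    public static void main(String[] args) {
        List<Range> owned = Arrays.asList(new Range(0L, 100L), new Range(900L, -900L));
        System.out.println(check(50L, owned, false));
        System.out.println(check(500L, owned, false));
        System.out.println(check(500L, owned, true));
    }
}
```

Keeping the reject decision behind a flag mirrors the trade-off the description raises: logging alone never costs availability, while rejection has to be weighed against the risk of performing the operation on the wrong node.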



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
