Hi Team,

When we try to run a token-range repair on Cassandra 3.11.13, it fails
with the errors below.

java.lang.RuntimeException: Repair job has failed with the error message: [2025-04-16 13:18:15,587] Some repair failed
at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:116)
at org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)

We tried the following steps to fix this issue:

1) Ran nodetool repair on the keyspace to repair the data. It failed.
2) Ran the nodetool scrub command to fix the SSTables. It completed
successfully.
3) Ran the repair at the token-range level. It fails with the error below:

WARN [RepairJobTask:2] 2025-04-15 14:23:54,525 RepairJob.java:131 - [repair #ff336760-19d6-11f0-ba1a-8115f9d2b6b2] recents sync failed
ERROR [RepairJobTask:2] 2025-04-15 14:23:54,525 RepairSession.java:295 - [repair #ff336760-19d6-11f0-ba1a-8115f9d2b6b2] Session completed with the following error
org.apache.cassandra.exceptions.RepairException: [repair #ff336760-19d6-11f0-ba1a-8115f9d2b6b2 on recent/recents, [(-3168661918511011379,-3161883880802608714]]] Sync failed between /10.4.33.143 and /10.4.33.138
at org.apache.cassandra.repair.RemoteSyncTask.syncComplete(RemoteSyncTask.java:67)
at org.apache.cassandra.repair.RepairSession.syncComplete(RepairSession.java:210)
at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:498)
at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:162)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:69)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
at java.lang.Thread.run(Thread.java:750)
ERROR [RepairJobTask:2] 2025-04-15 14:23:54,526 RepairRunnable.java:306 - Repair session ff336760-19d6-11f0-ba1a-8115f9d2b6b2 for range [(-3168661918511011379,-3161883880802608714]] failed with error [repair #ff336760-19d6-11f0-ba1a-8115f9d2b6b2 on recent/recents, [(-3168661918511011379,-3161883880802608714]]] Sync failed between /10.4.33.143 and /10.4.33.138
org.apache.cassandra.exceptions.RepairException: [repair #ff336760-19d6-11f0-ba1a-8115f9d2b6b2 on recent/recents, [(-3168661918511011379,-3161883880802608714]]] Sync failed between /10.4.33.143 and /10.4.33.138
at org.apache.cassandra.repair.RemoteSyncTask.syncComplete(RemoteSyncTask.java:67)
at org.apache.cassandra.repair.RepairSession.syncComplete(RepairSession.java:210)
at org.apache.cassandra.service.ActiveRepairService.handleMessage(ActiveRepairService.java:498)
at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:162)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:69)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84)
at java.lang.Thread.run(Thread.java:750)
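
For reference, a token-range repair like the one in step 3 is normally issued with nodetool's -st/-et (start-token/end-token) options. The exact command line is not shown in this message, so the sketch below is only an assumption about its shape. It assembles such a command from the keyspace (recent), table (recents) and token range that appear in the log above. The helper name build_subrange_repair_cmd is hypothetical, and any other options that were passed are unknown.

```python
from typing import List, Optional


def build_subrange_repair_cmd(
    keyspace: str,
    start_token: int,
    end_token: int,
    table: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for a nodetool token-range repair.

    Hypothetical helper: it only builds the command, it does not run it.
    -st / -et are nodetool repair's start-token / end-token options.
    """
    cmd = ["nodetool", "repair", "-st", str(start_token), "-et", str(end_token), keyspace]
    if table:
        # Restrict the repair to a single table when one is given.
        cmd.append(table)
    return cmd


# Values taken from the failing session in the log; the invocation itself is assumed.
cmd = build_subrange_repair_cmd(
    "recent",
    -3168661918511011379,
    -3161883880802608714,
    table="recents",
)
print(" ".join(cmd))
```

The list can be passed to subprocess.run on a node where nodetool is on the PATH. Building it as a list rather than a single string avoids shell-quoting problems with the negative token values.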

We also dumped the data for this token range and analysed it separately
to check for corruption, but we could not find any anomaly; the data
seems to be clean.

Please suggest further steps to fix the repair in our cluster.

Regards,
Soyal Badkur
