[ https://issues.apache.org/jira/browse/CASSANDRA-20348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17940025#comment-17940025 ]
Brandon Williams commented on CASSANDRA-20348:
----------------------------------------------

This jira is for the development of Apache Cassandra and, as you have experienced, makes for a poor vehicle for support. I recommend contacting the community for assistance on the user ML: https://cassandra.apache.org/_/community.html

> Issue with Merkle Tree Creation and Parent Repair Session Failure
> -----------------------------------------------------------------
>
> Key: CASSANDRA-20348
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20348
> Project: Apache Cassandra
> Issue Type: Bug
> Reporter: Aayush Gupta
> Priority: Normal
>
> We encountered an error while attempting to validate a repair session on our
> Cassandra cluster. The following issues were observed:
> * The validation failed during the creation of a Merkle tree for the repair
> session with ID {{a36dbed0-d787-11ef-84a0-59e4de22a9ca}} on the
> {{mailbox/messages_by_id}} table. Several ranges of SSTables were involved in
> the failure, and the logs indicate that the session could not be successfully
> validated due to issues with these SSTables.
> * The logs also show a failure in the parent repair session with ID
> {{a3695200-d787-11ef-84a0-59e4de22a9ca}}. The error traceback reveals
> that the {{ActiveRepairService}} could not retrieve the parent repair
> session, leading to the failure of the repair process. This caused further
> issues with repair message handling in the system.
> * The CassandraDaemon logs indicate that the error was related to an invalid
> or incomplete parent repair session, which was being handled by the
> {{RepairMessageVerbHandler}}. The repair operation failed, and the system
> attempted to remove the parent repair session, but was unable to proceed
> successfully.
> *Logs*:
> # The validation failed due to the inability to create a Merkle tree for a
> set of SSTables, and the parent repair session was marked as failed.
> # The {{RepairMessageVerbHandler}} encountered an exception, leading to the
> failure of the repair process, as seen in the attached stack traces.
> *Request*: We would appreciate assistance in understanding the cause of
> this failure, as well as any recommendations for resolving it. Specifically,
> any guidance on recovering the parent repair session or resolving issues with
> the Merkle tree creation would be helpful.
>
> Logs :
> [ERROR] [ValidationExecutor:196] 2025-01-20 17:38:23,942 Validator.java:237 -
> Failed creating a merkle tree for [repair
> #a36dbed0-d787-11ef-84a0-59e4de22a9ca on mailbox/messages_by_id,
> [(5734736046850292958,5753303790549862573],
> (5013853684854274868,5016711782125970873],
> (2418110930062086372,2421594217423380863],
> (3333294399526748440,3333803051887680609],
> (-7673852896118720379,-7668669570613527038],
> (-8735570610439541581,-8735399615559719989],
> (3467842041551014709,3480644019042625019]]], /10.X.X.X:7000 (see log for
> details)
> [ERROR] [ValidationExecutor:196] 2025-01-20 17:38:23,942
> ValidationManager.java:173 - Validation failed.
> java.lang.RuntimeException: Parent repair session with id =
> a3695200-d787-11ef-84a0-59e4de22a9ca has failed.
>     at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:690)
>     at org.apache.cassandra.db.repair.CassandraValidationIterator.getSSTablesToValidate(CassandraValidationIterator.java:116)
>     at org.apache.cassandra.db.repair.CassandraValidationIterator.<init>(CassandraValidationIterator.java:203)
>     at org.apache.cassandra.db.repair.CassandraTableRepairManager.getValidationIterator(CassandraTableRepairManager.java:51)
>     at org.apache.cassandra.repair.ValidationManager.getValidationIterator(ValidationManager.java:89)
>     at org.apache.cassandra.repair.ValidationManager.doValidation(ValidationManager.java:112)
>     at org.apache.cassandra.repair.ValidationManager.access$000(ValidationManager.java:41)
>     at org.apache.cassandra.repair.ValidationManager$1.call(ValidationManager.java:162)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:277)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.lang.Thread.run(Thread.java:826)
>
> Logs from 10.X.X.X:7000 :
>
> [ERROR] [AntiEntropyStage:1] 2025-01-20 17:38:23,724
> RepairMessageVerbHandler.java:212 - Got error, removing parent repair session
> [ERROR] [AntiEntropyStage:1] 2025-01-20 17:38:23,725 CassandraDaemon.java:581
> - Exception in thread Thread[AntiEntropyStage:1,5,main]
> java.lang.RuntimeException: java.lang.RuntimeException: Parent repair session
> with id = a3695200-d787-11ef-84a0-59e4de22a9ca has failed.
>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:215)
>     at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>     at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>     at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>     at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:432)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:277)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.lang.Thread.run(Thread.java:826)
> Caused by: java.lang.RuntimeException: Parent repair session with id =
> a3695200-d787-11ef-84a0-59e4de22a9ca has failed.
>     at org.apache.cassandra.service.ActiveRepairService.getParentRepairSession(ActiveRepairService.java:690)
>     at org.apache.cassandra.repair.RepairMessageVerbHandler.previewKind(RepairMessageVerbHandler.java:55)
>     at org.apache.cassandra.repair.RepairMessageVerbHandler.doVerb(RepairMessageVerbHandler.java:143)
>     ... 10 common frames omitted
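
When comparing excerpts like the two in this report, it can help to group log lines by the repair session ID they mention and order them by timestamp, so it is visible which node reported the parent session failure first. The following is only a rough sketch: the function name `group_by_session` and both regular expressions are assumptions derived from the log format quoted in this report, and are not part of Cassandra or its tooling.

```python
import re
from collections import defaultdict

# Assumed prefix format, taken from the excerpts in this report:
# "[LEVEL] [thread] YYYY-MM-DD HH:MM:SS,mmm ..."
LOG_PREFIX = re.compile(
    r"\[(?P<level>[A-Z]+)\]\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
)
# Repair session IDs appear as lowercase UUIDs in the excerpts.
SESSION_ID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def group_by_session(lines):
    """Map each session ID to sorted (timestamp, thread) pairs.

    Continuation lines (exception messages, stack frames) inherit the
    timestamp and thread of the most recent prefixed log line.
    """
    current = None
    sessions = defaultdict(set)
    for line in lines:
        prefix = LOG_PREFIX.match(line.strip())
        if prefix:
            current = (prefix.group("ts"), prefix.group("thread"))
        if current is None:
            continue  # no timestamped line seen yet
        for session_id in SESSION_ID.findall(line):
            sessions[session_id].add(current)
    # This timestamp format sorts chronologically as plain text.
    return {sid: sorted(seen) for sid, seen in sessions.items()}


if __name__ == "__main__":
    sample = [
        "[ERROR] [AntiEntropyStage:1] 2025-01-20 17:38:23,725 CassandraDaemon.java:581 - Exception in thread",
        "java.lang.RuntimeException: Parent repair session with id = a3695200-d787-11ef-84a0-59e4de22a9ca has failed.",
        "[ERROR] [ValidationExecutor:196] 2025-01-20 17:38:23,942 ValidationManager.java:173 - Validation failed.",
        "java.lang.RuntimeException: Parent repair session with id = a3695200-d787-11ef-84a0-59e4de22a9ca has failed.",
    ]
    for session_id, seen in group_by_session(sample).items():
        print(session_id, seen)
```

Feeding it the concatenated logs from each node gives one timeline per session ID, which is a convenient summary to include when asking for help on the user mailing list.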