[ 
https://issues.apache.org/jira/browse/CASSANDRA-20147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Sims updated CASSANDRA-20147:
-------------------------------------
                        Authors: Elliott Sims
    Test and Documentation Plan: 
Ran BatchCommitLogStressTest, CommitLogStressTest, and GroupCommitLogStressTest 
plus stood up a single instance and ran "cassandra-stress write" against it.

I wish I could repro consistently to prove the bug is also fixed, but the 
initial production impact happened after several weeks of consistent load.
                         Status: Patch Available  (was: Open)

> Race in BatchCommitLogService in 4.1+
> -------------------------------------
>
>                 Key: CASSANDRA-20147
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20147
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log
>            Reporter: Elliott Sims
>            Priority: Normal
>             Fix For: 4.1.x, 5.0.x, 5.0.3
>
>
> Saw this crash in production on multiple hosts:
> {code:java}
> ERROR [MutationStage-165] 2024-12-16 04:36:35,914 
> JVMStabilityInspector.java:68 - Exception in thread 
> Thread[MutationStage-165,5,SharedPool]
> java.lang.RuntimeException: java.lang.Error: Maximum permit count exceeded
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:108)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>         at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
>         at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>         at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.Error: Maximum permit count exceeded
>         at 
> java.base/java.util.concurrent.Semaphore$Sync.tryReleaseShared(Semaphore.java:198)
>         at 
> java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1382)
>         at 
> java.base/java.util.concurrent.Semaphore.release(Semaphore.java:619)
>         at 
> org.apache.cassandra.db.commitlog.AbstractCommitLogService.requestExtraSync(AbstractCommitLogService.java:297)
>         at 
> org.apache.cassandra.db.commitlog.BatchCommitLogService.maybeWaitForSync(BatchCommitLogService.java:40)
>         at 
> org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:284)
>         at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:330)
>         at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:100)
>         at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:54)
>         at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:641)
>         at org.apache.cassandra.db.Keyspace.applyFuture(Keyspace.java:489)
>         at org.apache.cassandra.db.Mutation.applyFuture(Mutation.java:223)
>         at 
> org.apache.cassandra.db.MutationVerbHandler.doVerb(MutationVerbHandler.java:63)
>         at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>         at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>         ... 6 common frames omitted {code}
> Digging through, it looks like in 4.1.0 {{AbstractCommitLogService}} was 
> reworked to use Semaphore instead of LockSupport.  From a skim of the docs, 
> {{LockSupport.unpark()}} is safe to run even if the permit's already 
> available where {{Semaphore.release()}} is not.
> I think that means that if more writes arrive during the right part of a 
> flush, it double-decrements `haveWork`.  Once it's decremented and starts 
> throwing, I think it never successfully makes it to another acquire/try
> The least-invasive fix is probably to call `haveWork.drainPermits()` in 
> `requestExtraSync()` right before `release(1)` to ensure that 
> `havework.permits` is 0 right before we increment it.  That may technically 
> still leave a race, but an almost-impossibly small one rather than a 
> nontrivial one.  It might also be possible to catch that specific Exception 
> and run acquire() or (probably better) drainPermits() and recover.  That does 
> feel like a weird thing to do for a semaphore exception, but should be OK in 
> this case since it's just being used to sleep the worker loop instead of 
> actual concurrency control.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to