[
https://issues.apache.org/jira/browse/CASSANDRA-19776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950888#comment-17950888
]
Branimir Lambov edited comment on CASSANDRA-19776 at 5/12/25 9:21 AM:
----------------------------------------------------------------------
bq. 4) compaction does not take into account expired sstables when doing "try (Refs<SSTableReader> refs = Refs.ref(actuallyCompact))" (actuallyCompact will _not_ contain expired ones); expired ones are just a logical part of the compaction, but I do not see that we would actually reference them
It's not compaction's job here to ensure that there is a live reference to the
expired sstables for other uses. The reference is taken because the operation
has to read these sstables (only the ones it is actually compacting, which may
be an even smaller subset because of parallelism), and such reads must always
be protected by a reference.
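For illustration, a minimal sketch of that idiom; the wrapper class and method
are hypothetical, but Refs.ref/Refs.tryRef and SSTableReader.estimatedKeys are
the real utilities involved:
{code:java}
import org.apache.cassandra.io.sstable.format.SSTableReader;
import org.apache.cassandra.utils.concurrent.Refs;

// Hypothetical helper, for illustration only; not the compaction code itself.
public class ReadProtection
{
    static long countKeysWhileReferenced(Iterable<SSTableReader> actuallyCompact)
    {
        // Refs.ref throws if any reader was already released;
        // Refs.tryRef (see selectAndReference below) returns null instead.
        try (Refs<SSTableReader> refs = Refs.ref(actuallyCompact))
        {
            long keys = 0;
            for (SSTableReader reader : actuallyCompact)
                keys += reader.estimatedKeys(); // safe: a live Ref is held
            return keys;
        }
        // on close the operation's refs are released; the tracker still holds
        // its own reference until the lifecycle transaction completes
    }
}
{code}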
Because the sstables are in the tracker, they must have an outstanding
reference until they are moved out by the completion of a transaction. Starting
a compaction on something (be it expired or not) cannot remove that outstanding
reference, if only because without it the rollback of the transaction could no
longer be done. I guess what we need to do is track down whether anything
removes a reference on the expired sstables before the completion of that
transaction. That may have been done as a premature optimization to get some of
the effect of expiration early.
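To make the invariant concrete, here is a self-contained toy model (made-up
classes, not Cassandra's Ref machinery) of what goes wrong if something drops
the tracker's reference on an expired sstable before the transaction completes:
{code:java}
import java.util.concurrent.atomic.AtomicInteger;

// Toy stand-in for a ref-counted reader; real readers use Ref/RefCounted.
class ToyReader
{
    final String name;
    final AtomicInteger refs = new AtomicInteger(1); // the tracker's reference

    ToyReader(String name) { this.name = name; }

    boolean tryRef()
    {
        int c;
        do
        {
            c = refs.get();
            if (c == 0)
                return false; // already released: tryRef must fail from now on
        }
        while (!refs.compareAndSet(c, c + 1));
        return true;
    }

    void release()
    {
        if (refs.decrementAndGet() == 0)
            System.out.println(name + " fully released, files may be deleted");
    }
}

public class RollbackDemo
{
    public static void main(String[] args)
    {
        ToyReader expired = new ToyReader("nb-1-big");

        // WRONG: releasing the tracker's reference when a compaction *starts*
        // on an expired sstable instead of when the transaction *completes*.
        expired.release();

        // A rollback, or any view reader, now has nothing left to capture:
        System.out.println("can re-reference: " + expired.tryRef()); // false
    }
}
{code}
Once the count reaches zero the reader can never be re-referenced, which is
exactly the state the selectAndReference loop below spins on.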
> Spinning trying to capture readers
> ----------------------------------
>
> Key: CASSANDRA-19776
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19776
> Project: Apache Cassandra
> Issue Type: Bug
> Components: Legacy/Core
> Reporter: Cameron Zemek
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: extract.log
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> On a handful of clusters we are noticing spin locks occurring. I traced back
> all the calls to the EstimatedPartitionCount metric (e.g.
> org.apache.cassandra.metrics:type=Table,keyspace=testks,scope=testcf,name=EstimatedPartitionCount),
> using the following patched function:
> {code:java}
> public RefViewFragment selectAndReference(Function<View, Iterable<SSTableReader>> filter)
> {
>     long failingSince = -1L;
>     boolean first = true;
>     while (true)
>     {
>         ViewFragment view = select(filter);
>         Refs<SSTableReader> refs = Refs.tryRef(view.sstables);
>         if (refs != null)
>             return new RefViewFragment(view.sstables, view.memtables, refs);
>         if (failingSince <= 0)
>         {
>             failingSince = System.nanoTime();
>         }
>         else if (System.nanoTime() - failingSince > TimeUnit.MILLISECONDS.toNanos(100))
>         {
>             List<SSTableReader> released = new ArrayList<>();
>             for (SSTableReader reader : view.sstables)
>                 if (reader.selfRef().globalCount() == 0)
>                     released.add(reader);
>             NoSpamLogger.log(logger, NoSpamLogger.Level.WARN, 1, TimeUnit.SECONDS,
>                              "Spinning trying to capture readers {}, released: {}", view.sstables, released);
>             if (first)
>             {
>                 first = false;
>                 try
>                 {
>                     throw new RuntimeException("Spinning trying to capture readers");
>                 }
>                 catch (Exception e)
>                 {
>                     logger.warn("Spin lock stacktrace", e);
>                 }
>             }
>             failingSince = System.nanoTime();
>         }
>     }
> }
> {code}
> Digging into this code, I found it will fail if any of the sstables are in a
> released state (i.e. reader.selfRef().globalCount() == 0).
> See the attached extract.log for an example of one of these spin lock
> occurrences. Sometimes these spin locks last over 5 minutes. On the worst
> affected cluster, I ran a log processing script that, every time the 'Spinning
> trying to capture readers' message differed from the previous one, checked
> whether the released tables were in a Compacting state. Every single
> occurrence has it spin locking with the released list naming an sstable that
> is compacting.
> In the extract.log example, it is spin locking saying that
> nb-320533-big-Data.db has been released, but you can see that prior to the
> spinning that sstable is involved in a compaction. The compaction completes at
> 01:03:36 and the spinning stops. nb-320533-big-Data.db is deleted at 01:03:49
> along with the other 9 sstables involved in the compaction.
>
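For reference, a rough reconstruction of the log-processing heuristic described
in the report above; the code and the log-line shapes it assumes are
hypothetical, not the script actually used:
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpinScan
{
    // Matches sstable generation names such as nb-320533-big.
    private static final Pattern SSTABLE = Pattern.compile("nb-\\d+-big");

    public static void main(String[] args) throws IOException
    {
        Set<String> compacting = new HashSet<>();
        String lastSpin = null;
        for (String line : Files.readAllLines(Paths.get(args[0])))
        {
            if (line.contains("Compacting"))
                collect(line, compacting::add);      // sstables entering a compaction
            else if (line.contains("Compacted"))
                collect(line, compacting::remove);   // compaction finished
            else if (line.contains("Spinning trying to capture readers")
                     && !line.equals(lastSpin))      // only report *new* spin messages
            {
                lastSpin = line;
                int idx = line.indexOf("released");
                if (idx < 0)
                    continue;
                Set<String> released = new HashSet<>();
                collect(line.substring(idx), released::add);
                released.retainAll(compacting);      // released *and* still compacting?
                System.out.println("released while compacting: " + released);
            }
        }
    }

    private static void collect(String line, Consumer<String> sink)
    {
        Matcher m = SSTABLE.matcher(line);
        while (m.find())
            sink.accept(m.group());
    }
}
{code}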