I would like to discuss CASSANDRA-19776 (1). The problem is that when a
metric such as EstimatedPartitionCount is read while a compaction is in
progress, the call can spin for as long as the compaction runs.

The reason it spins (summarized in (2)) is that when compaction evaluates
an SSTable as expired / to be dropped, that SSTable is not physically
removed until the very end of the compaction. Its "tidier" is set up, and
it will eventually remove the files on disk once the transaction has
finished.

Once nobody references such an SSTable, a call to selectAndReference from
EstimatedPartitionCount will spin: it waits for a reference that can no
longer be acquired, because the SSTable was already "unreferenced", just
not yet deleted. It is in a kind of limbo.
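
To make the limbo concrete, here is a minimal toy model of the retry loop.
It is only a sketch: LimboSketch, ToySSTable and the maxAttempts bound are
hypothetical names invented for illustration, not Cassandra's Ref / View /
SSTableReader classes, and the real loop has no attempt limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: hypothetical classes, not Cassandra's actual implementation.
final class LimboSketch
{
    // A reference-counted handle; once the count drops to zero it can never be re-acquired.
    static final class ToySSTable
    {
        final String name;
        private final AtomicInteger refs = new AtomicInteger(1);

        ToySSTable(String name) { this.name = name; }

        boolean tryRef()
        {
            while (true)
            {
                int cur = refs.get();
                if (cur <= 0)
                    return false; // already fully released ("unreferenced")
                if (refs.compareAndSet(cur, cur + 1))
                    return true;
            }
        }

        void release() { refs.decrementAndGet(); }
    }

    // Retries until every sstable in the view can be referenced.
    // Returns the number of attempts used, or -1 after maxAttempts failures
    // (the bound exists only so that this sketch terminates).
    static int selectAndReference(List<ToySSTable> view, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            List<ToySSTable> acquired = new ArrayList<>();
            boolean ok = true;
            for (ToySSTable sstable : view)
            {
                if (sstable.tryRef())
                    acquired.add(sstable);
                else
                {
                    ok = false;
                    break;
                }
            }
            if (ok)
                return attempt;
            for (ToySSTable sstable : acquired)
                sstable.release(); // roll back partial references before retrying
        }
        return -1;
    }

    public static void main(String[] args)
    {
        ToySSTable live = new ToySSTable("live");
        ToySSTable expired = new ToySSTable("expired");
        expired.release(); // unreferenced, but still listed in the view
        System.out.println(selectAndReference(List.of(live, expired), 5));
        System.out.println(selectAndReference(List.of(live), 5));
    }
}
```

As long as the released SSTable stays in the selected view, every attempt
fails; the loop only succeeds once the view no longer contains it.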

Branimir Lambov suggested that it is probably not a good idea to reference
expired SSTables on CANONICAL (3).

My idea was to do this (4). isMarkedCompacted is defined as:

    public boolean isMarkedCompacted()
    {
        return tidy.global.obsoletion != null;
    }

The obsoletion field is non-null once the SSTable is going to be removed
from disk / nobody references it. So, we would filter such SSTables out.
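
The filtering idea can be sketched in isolation as follows. This is an
illustration under assumptions, not the code in (4): CanonicalFilterSketch,
ToySSTable, selectCanonical and the Runnable-typed obsoletion field are
hypothetical stand-ins for the real view selection and for
tidy.global.obsoletion.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy model only: hypothetical classes, not the actual patch.
final class CanonicalFilterSketch
{
    static final class ToySSTable
    {
        final String name;
        // Stand-in for tidy.global.obsoletion: non-null once the sstable is marked obsolete.
        volatile Runnable obsoletion;

        ToySSTable(String name) { this.name = name; }

        boolean isMarkedCompacted()
        {
            return obsoletion != null;
        }
    }

    // Keeps only the sstables that have not been marked obsolete,
    // so a caller never tries to reference one that is about to be deleted.
    static List<ToySSTable> selectCanonical(List<ToySSTable> view)
    {
        return view.stream()
                   .filter(sstable -> !sstable.isMarkedCompacted())
                   .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        ToySSTable live = new ToySSTable("live");
        ToySSTable expired = new ToySSTable("expired");
        expired.obsoletion = () -> {}; // marked obsolete, files not yet removed
        for (ToySSTable sstable : selectCanonical(List.of(live, expired)))
            System.out.println(sstable.name);
    }
}
```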

Jaydeepkumar Chovatia suggested that this approach might lead to "serious
repercussions" (5), that we should not touch it, and that we should do this
instead (6). However, that is not possible because, as Branimir mentioned:

"The selectAndReference call in estimatedPartitionCount was added recently
to fix a race that caused node failures when an sstable disappears while
it's being processed."

It is worth noting that selectAndReference does not seem to be used
consistently across the metrics. That also raises the question of whether
we should approach this more holistically and cover all such cases.

Do you also see (4) as risky? I built it for 4.0 and CI (7) seems to pass
except for one test, which exercises this very CANONICAL functionality.

What are your takes here?

Regards

(1) https://issues.apache.org/jira/browse/CASSANDRA-19776
(2)
https://issues.apache.org/jira/browse/CASSANDRA-19776?focusedCommentId=17950873&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17950873
(3)
https://issues.apache.org/jira/browse/CASSANDRA-19776?focusedCommentId=17950979&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17950979
(4)
https://github.com/apache/cassandra/pull/4156/files#diff-92c8e689de9c33eb580a18eef6d7db02d1fb089183c32c8c8d99344d0964326c
(5)
https://issues.apache.org/jira/browse/CASSANDRA-19776?focusedCommentId=17952394&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17952394
(6)
https://issues.apache.org/jira/browse/CASSANDRA-19776?focusedCommentId=17952747&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17952747
(7)
https://app.circleci.com/pipelines/github/instaclustr/cassandra/5803/workflows/0935b05f-e246-463f-95fc-6dcc3822d611
