[jira] [Created] (CASSANDRA-20668) SAI range queries can spend too much time in TrieMemoryIndex$Collector#updateLastQueueSize() doing unproductive work

Caleb Rackliffe (Jira) Wed, 21 May 2025 10:32:21 -0700

Caleb Rackliffe created CASSANDRA-20668:
-------------------------------------------


             Summary:  SAI range queries can spend too much time in 
TrieMemoryIndex$Collector#updateLastQueueSize() doing unproductive work
                 Key: CASSANDRA-20668
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20668
             Project: Apache Cassandra
          Issue Type: Improvement
          Components: Feature/SAI
            Reporter: Caleb Rackliffe
            Assignee: Caleb Rackliffe
         Attachments: e20ea330-3519-11f0-912e-31b7d282b812-flamegraph.html

{{TrieMemoryIndex}} performs range queries by traversing the in-memory trie, 
finding the terms in the range, and adding the primary keys for each term to a 
priority queue (because they need to be in order). At one point, to make sure 
we didn't have an issue with the priority queue's backing array being 
constantly resized for large ranges, we started tracking the last queue size to 
reuse it when creating a new {{Collector}} for the query. In some recent 
performance testing, we've seen up to 10% of samples in a CPU flame graph 
(attached) in {{updateLastQueueSize()}}, which needs to be addressed.
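
To make the size-tracking idea concrete, here is a minimal sketch. It is not the actual {{TrieMemoryIndex}} code: the class name {{QueueSizeHintSketch}}, the {{finish()}} method, the {{MIN_CAPACITY}} constant, and the use of {{Long}} as a stand-in for primary keys are all assumptions made for illustration. It presizes each new collector's priority queue from the previous query's final size and, as one possible way to keep the bookkeeping cheap, publishes the hint once per query rather than on every term.

```java
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names and structure are not taken from Cassandra.
public class QueueSizeHintSketch {
    private static final int MIN_CAPACITY = 16; // assumed floor for the hint

    // Shared across queries: the size the last collector's queue reached.
    private final AtomicInteger lastQueueSize = new AtomicInteger(MIN_CAPACITY);

    public final class Collector {
        // Presize from the hint so large ranges avoid repeated array growth.
        public final PriorityQueue<Long> keys =
            new PriorityQueue<>(Math.max(MIN_CAPACITY, lastQueueSize.get()));

        public void add(long primaryKey) {
            keys.add(primaryKey); // no shared-state update on the hot path
        }

        // Publish the hint a single time, when the query has finished collecting.
        public void finish() {
            lastQueueSize.set(Math.max(MIN_CAPACITY, keys.size()));
        }
    }

    public Collector newCollector() {
        return new Collector();
    }

    public int sizeHint() {
        return lastQueueSize.get();
    }
}
```

Whether a single end-of-query update matches what the real {{updateLastQueueSize()}} needs is left open here; the sketch only shows where the shared write could sit relative to the per-term work.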

The naïve question here is, of course, why the priority queue itself isn't 
thread-local. Some of the lambda usage in this path, along with other 
micro-optimizations, might also be worth a look...
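
For illustration, a thread-local queue could look roughly like the sketch below. The class and method names ({{ThreadLocalQueueSketch}}, {{collectSorted}}) are hypothetical, and {{Long}} again stands in for primary keys. It relies on the fact that, in OpenJDK's implementation, {{PriorityQueue.clear()}} empties the queue without shrinking its backing array, so each thread keeps whatever capacity its earlier queries grew to and no shared size hint is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; not the approach confirmed for this ticket.
public class ThreadLocalQueueSketch {
    // One reusable queue per thread; nothing is shared between threads.
    private static final ThreadLocal<PriorityQueue<Long>> QUEUE =
        ThreadLocal.withInitial(PriorityQueue::new);

    public static List<Long> collectSorted(long[] primaryKeys) {
        PriorityQueue<Long> queue = QUEUE.get();
        queue.clear(); // drop leftovers from an earlier query, keep the capacity

        for (long key : primaryKeys) {
            queue.add(key);
        }

        // Drain eagerly: the queue is reused, so results must not alias it.
        List<Long> ordered = new ArrayList<>(queue.size());
        while (!queue.isEmpty()) {
            ordered.add(queue.poll());
        }
        return ordered;
    }
}
```

One trade-off to keep in mind with this shape: results have to be copied out before the thread runs another query, and a single very large range leaves a large array pinned to that thread until it exits.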



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
