Caleb Rackliffe created CASSANDRA-20668:
-------------------------------------------

Summary: SAI range queries can spend too much time in TrieMemoryIndex$Collector#updateLastQueueSize() doing unproductive work
Key: CASSANDRA-20668
URL: https://issues.apache.org/jira/browse/CASSANDRA-20668
Project: Apache Cassandra
Issue Type: Improvement
Components: Feature/SAI
Reporter: Caleb Rackliffe
Assignee: Caleb Rackliffe
Attachments: e20ea330-3519-11f0-912e-31b7d282b812-flamegraph.html

{{TrieMemoryIndex}} performs range queries by traversing the in-memory trie, finding the terms in the range, and adding the primary keys for each term to a priority queue (because they need to be in order). At one point, to make sure we didn't have an issue with the priority queue's backing array being constantly resized for large ranges, we started tracking the last queue size to reuse it when creating a new {{Collector}} for the query.

In some recent performance testing, we've seen up to 10% of samples in a CPU flame graph (attached) in {{updateLastQueueSize()}}, which needs to be addressed.

The naïve question here is, of course, why the priority queue itself isn't thread local. There are also some usages of lambdas and other micro-optimizations that might be possible...
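For readers unfamiliar with the pattern described above, here is a minimal sketch of pre-sizing a priority queue from the size reached by the previous query. It is not the Cassandra implementation: the class {{QueueSizeHintSketch}}, its methods, the {{TreeMap}} standing in for the trie, the default capacity of 16, and the choice of a thread-local hint are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch; names and structure do not mirror the real TrieMemoryIndex code.
class QueueSizeHintSketch {
    // A sorted map stands in for the in-memory trie: term -> primary keys.
    private final NavigableMap<String, List<Long>> terms = new TreeMap<>();

    // Assumed: a per-thread hint holding the queue size reached by the last query.
    private final ThreadLocal<Integer> lastQueueSize = ThreadLocal.withInitial(() -> 16);

    void add(String term, long primaryKey) {
        terms.computeIfAbsent(term, t -> new ArrayList<>()).add(primaryKey);
    }

    int sizeHint() {
        return lastQueueSize.get();
    }

    // Returns the primary keys of every term in [lower, upper], in key order.
    List<Long> rangeQuery(String lower, String upper) {
        // Pre-size the queue from the previous query so the backing array
        // does not have to grow repeatedly for large ranges.
        PriorityQueue<Long> queue = new PriorityQueue<>(Math.max(1, lastQueueSize.get()));
        for (List<Long> keys : terms.subMap(lower, true, upper, true).values())
            queue.addAll(keys);

        // In this sketch the hint is written once, after collection finishes.
        lastQueueSize.set(queue.size());

        // Drain the queue to produce the keys in sorted order.
        List<Long> ordered = new ArrayList<>(queue.size());
        while (!queue.isEmpty())
            ordered.add(queue.poll());
        return ordered;
    }

    public static void main(String[] args) {
        QueueSizeHintSketch index = new QueueSizeHintSketch();
        index.add("b", 3L);
        index.add("a", 5L);
        index.add("c", 1L);
        index.add("z", 0L);
        System.out.println(index.rangeQuery("a", "c")); // prints [1, 3, 5]
    }
}
```

Writing the hint once per query, as this sketch does, keeps the bookkeeping off the per-term path. Whether that matches or differs from what {{updateLastQueueSize()}} currently does is not stated in this ticket.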