[jira] [Commented] (CASSANDRA-20465) Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get usage

Sam Tunnicliffe (Jira) Tue, 25 Mar 2025 10:20:07 -0700


    [ https://issues.apache.org/jira/browse/CASSANDRA-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938297#comment-17938297 ]


Sam Tunnicliffe commented on CASSANDRA-20465:
---------------------------------------------

I think that in theory it would be possible to more or less go back to how this 
worked in 5.0. That is, we could cache the {{TableMetadata}} in 
{{TableMetadataRef}} and bring back something akin to {{TableMetadataRefCache}} 
as a kind of registry of {{TableMetadataRef}} instances. 
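
A minimal sketch of that shape, for illustration only. {{TableMetaSketch}}, {{CachedTableRef}} and {{TableRefRegistry}} are hypothetical stand-ins, not the real Cassandra classes, and the sketch assumes some schema-change hook pushes new metadata into the registry:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for TableMetadata: an immutable snapshot of a table's schema.
final class TableMetaSketch {
    final String id;
    final long epoch;

    TableMetaSketch(String id, long epoch) {
        this.id = id;
        this.epoch = epoch;
    }
}

// Hypothetical stand-in for TableMetadataRef that caches the current snapshot.
final class CachedTableRef {
    private volatile TableMetaSketch current;

    CachedTableRef(TableMetaSketch initial) {
        this.current = initial;
    }

    // get() is a single volatile read, with no lookup on the hot path.
    TableMetaSketch get() {
        return current;
    }

    void set(TableMetaSketch updated) {
        this.current = updated;
    }
}

// Hypothetical registry of refs, keyed by table id, updated on schema change.
final class TableRefRegistry {
    private final Map<String, CachedTableRef> refs = new ConcurrentHashMap<>();

    // Returns the existing ref for this table id, or registers a new one.
    CachedTableRef register(TableMetaSketch metadata) {
        return refs.computeIfAbsent(metadata.id, id -> new CachedTableRef(metadata));
    }

    // Pushes new metadata into the existing ref so holders see it on the next get().
    void onSchemaChange(TableMetaSketch updated) {
        CachedTableRef ref = refs.get(updated.id);
        if (ref != null)
            ref.set(updated);
    }
}
```

Anything holding a {{CachedTableRef}} keeps the same object across schema changes and only pays for a field read in {{get()}}.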

One thing to be mindful of is that when a {{ColumnFamilyStore}} is created or 
reloaded following a schema change in {{Keyspace::initCf}}, we have to be sure 
to use the new metadata, as at that point {{ClusterMetadata.current()}} will 
still return the previous version, so any config changes won't be present there. 
This was one of the things that led us to rework this area way back in the 
early days of CEP-21 development. Thankfully, I think we overcame most of the 
issues that seemed to complicate this back then, but we never got round to 
revisiting this to simplify it.

Alternatively, it should be quite possible to make the existing implementation 
cheaper without quite so much rework by using a lookup {{HashMap}} in 
{{ClusterMetadata}} and having {{TableMetadataRef::get}} use that instead of 
{{Schema.getTableMetadata}} (to trade btree for hash lookups). The tricky part 
here would _probably_ be dealing with the prev/next issue that I mentioned 
above (though again, it may be that this is not really such an issue anymore).  
    

So on the one hand, I think there probably is mileage to be had in revisiting 
{{TableMetadataRef}} itself. However, I also think looking at the usage and 
callsites is perhaps even more valuable in the long term. The 4 optimisations 
you listed here seem entirely sensible and in general it seems reasonable that 
most operations would want to obtain a single {{TableMetadata}} and use that 
instance/version consistently for the entire duration.
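
To illustrate the callsite pattern, here is a small sketch in which an operation resolves the metadata once and reuses that instance for every row. {{TableSchemaSketch}} and {{SingleLookupSketch}} are hypothetical names, and the {{Supplier}} stands in for {{TableMetadataRef}}:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for TableMetadata.
final class TableSchemaSketch {
    final String name;

    TableSchemaSketch(String name) {
        this.name = name;
    }
}

final class SingleLookupSketch {
    // Resolves the metadata once, then uses that instance for the whole operation,
    // so every row is processed against the same schema version.
    static List<String> labelRows(Supplier<TableSchemaSketch> ref, List<String> rowKeys) {
        TableSchemaSketch metadata = ref.get(); // single lookup, hoisted out of the loop
        List<String> labels = new ArrayList<>(rowKeys.size());
        for (String key : rowKeys)
            labels.add(metadata.name + ":" + key);
        return labels;
    }
}
```

Besides saving lookups, this also means a schema change halfway through cannot make one operation see two different versions.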

I'm happy to help if you want to tackle this.
 

> Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get 
> usage 
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20465
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20465
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema, Transactional Cluster Metadata
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: 5.1_cpu.html, image-2025-03-20-22-43-09-044.png, 
> image-2025-03-20-22-46-34-571.png, image-2025-03-20-22-52-40-818.png, 
> image-2025-03-20-22-53-25-001.png, image-2025-03-20-22-56-31-298.png, 
> image-2025-03-20-22-58-00-837.png
>
>
> Before the TCM changes, an org.apache.cassandra.schema.TableMetadataRef#get 
> invocation was cheap (it just returned a field value). Now it does a lookup 
> from Schema every time, with a search in a BTree plus a not very cheap check 
> of whether it is a system keyspace.
> Diff:
> !image-2025-03-20-22-43-09-044.png|width=300!
> We have several places in the code which use TableMetadataRef#get and assume a 
> low cost for it.
> Currently about 0.93% of total CPU is spent on this operation. If we 
> check the percentage for (compaction + flush) threads, it is 5.4%, and 9.4% for 
> compaction only ( [^5.1_cpu.html] ).
> I am not sure if it is easy to reduce the overhead in TableMetadataRef#get 
> itself, but we can also avoid it in many cases with a small adjustment of the 
> logic on the caller side to avoid calling TableMetadataRef#get too frequently:
> 1) org.apache.cassandra.db.ColumnFamilyStore#isRowCacheEnabled - by default 
> row cache is fully disabled - probably it is better to check if it is enabled 
> as a first condition:
> !image-2025-03-20-22-46-34-571.png|width=300!
> 2) org.apache.cassandra.db.memtable.TrieMemtable#getFlushSet - we can look up 
> the metadata once at the beginning of the getFlushSet logic
> !image-2025-03-20-22-52-40-818.png|width=300! 
> !image-2025-03-20-22-53-25-001.png|width=300!
> 3) org.apache.cassandra.io.sstable.SSTableIdentityIterator.create - consider 
> whether we can retrieve TableMetadata at the beginning of a compaction and use 
> it throughout.
> !image-2025-03-20-22-56-31-298.png|width=300!
> 4) org.apache.cassandra.io.sstable.keycache.KeyCacheSupport.getCacheKey - 
> consider whether we can retrieve only the needed id/indexName fields once (at 
> least an id does not look like a dynamically changed parameter)
> !image-2025-03-20-22-58-00-837.png|width=300!
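
As an illustration of item 1 in the quoted description, the reordering could look roughly like the sketch below. This is not the actual {{ColumnFamilyStore#isRowCacheEnabled}} code: {{RowCacheCheckSketch}}, the {{rowCacheCapacity}} parameter and the {{tableCachesRows}} supplier (standing in for a check that goes through {{TableMetadataRef#get}}) are all assumptions made for the example:

```java
import java.util.function.BooleanSupplier;

final class RowCacheCheckSketch {
    // The cheap global check comes first; && short-circuits, so the per-table
    // metadata lookup is skipped entirely when the row cache is disabled.
    static boolean isRowCacheEnabled(long rowCacheCapacity, BooleanSupplier tableCachesRows) {
        return rowCacheCapacity > 0 && tableCachesRows.getAsBoolean();
    }
}
```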



