[ https://issues.apache.org/jira/browse/CASSANDRA-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938297#comment-17938297 ]
Sam Tunnicliffe commented on CASSANDRA-20465:
---------------------------------------------

I think that in theory it would be possible to more or less go back to how this worked in 5.0. That is, we could cache the {{TableMetadata}} in {{TableMetadataRef}} and bring back something akin to {{TableMetadataRefCache}} as a kind of registry of {{TableMetadataRef}} instances. One thing to be mindful of is that when a {{ColumnFamilyStore}} is created or reloaded following a schema change in {{Keyspace::initCf}}, we have to be sure to use the new metadata, because at that point {{ClusterMetadata.current()}} will still return the previous version, so any config changes won't be present there. This was one of the things that led us to rework this area way back in the early days of CEP-21 development. Thankfully, I think we overcame most of the issues that seemed to complicate this back then, but we never got round to revisiting this to simplify it.

Alternatively, it should be quite possible to make the existing implementation cheaper without quite so much rework by using a lookup {{HashMap}} in {{ClusterMetadata}} and having {{TableMetadataRef::get}} use that instead of {{Schema.getTableMetadata}} (to trade btree for hash lookups). The tricky part here would _probably_ be dealing with the prev/next issue that I mentioned above (though again, it may be that this is not really such an issue anymore).

So on the one hand, I think there probably is mileage to be had in revisiting {{TableMetadataRef}} itself. However, I also think looking at the usage and callsites is perhaps even more valuable in the long term. The 4 optimisations you listed here seem entirely sensible, and in general it seems reasonable that most operations would want to obtain a single {{TableMetadata}} and use that instance/version consistently for their entire duration.

I'm happy to help if you want to tackle this.

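To make the two ideas above concrete, here is a rough sketch of a ref that caches its metadata, a registry that pushes new metadata into existing refs, and a call site that reads the metadata once per operation. All of the names below ({{TableRefSketch}}, {{Meta}}, {{Ref}}, {{Registry}}, {{onSchemaChange}}, {{processRows}}) are hypothetical stand-ins for illustration, not Cassandra's actual classes or methods. The sketch also assumes the new metadata is handed directly to the ref when a schema change is applied, rather than being re-read from a global "current" view that may still be at the previous version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins only; none of these are Cassandra's real classes.
final class TableRefSketch {

    /** Immutable snapshot of a table's schema at one version (stand-in for TableMetadata). */
    static final class Meta {
        final String id;
        final long epoch;
        final boolean rowCacheEnabled;

        Meta(String id, long epoch, boolean rowCacheEnabled) {
            this.id = id;
            this.epoch = epoch;
            this.rowCacheEnabled = rowCacheEnabled;
        }
    }

    /** Stand-in for a ref that caches its metadata: get() is a single volatile read. */
    static final class Ref {
        private volatile Meta current;

        Ref(Meta initial) { this.current = initial; }

        Meta get() { return current; }

        void set(Meta next) { this.current = next; }
    }

    /** Stand-in for a registry of refs, keyed by table id (hash lookup, not a tree search). */
    static final class Registry {
        private final Map<String, Ref> refs = new ConcurrentHashMap<>();

        /** Returns the existing ref for this table id, or creates one from the given metadata. */
        Ref register(Meta initial) {
            return refs.computeIfAbsent(initial.id, id -> new Ref(initial));
        }

        /** Assumed hook: the new metadata is passed in explicitly when a schema change is applied. */
        void onSchemaChange(Meta next) {
            Ref ref = refs.get(next.id);
            if (ref != null) {
                ref.set(next);
            }
        }
    }

    /** Call-site pattern: obtain the metadata once and use that version for the whole operation. */
    static long processRows(Ref ref, int rows) {
        Meta meta = ref.get(); // single lookup, hoisted out of the loop
        long sum = 0;
        for (int i = 0; i < rows; i++) {
            sum += meta.epoch; // every iteration sees the same snapshot
        }
        return sum;
    }
}
```

A snapshot obtained before a schema change keeps its old values, while later calls to {{get()}} on the same ref see the new version. That is the "use one instance/version consistently" behaviour described above.
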
> Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get usage
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20465
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20465
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema, Transactional Cluster Metadata
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: 5.1_cpu.html, image-2025-03-20-22-43-09-044.png,
> image-2025-03-20-22-46-34-571.png, image-2025-03-20-22-52-40-818.png,
> image-2025-03-20-22-53-25-001.png, image-2025-03-20-22-56-31-298.png,
> image-2025-03-20-22-58-00-837.png
>
>
> Before the TCM changes, an org.apache.cassandra.schema.TableMetadataRef#get
> invocation was cheap (it just returned a field value). Now it does a lookup
> from Schema every time, with a search in a BTree plus a not very cheap check
> of whether it is a system keyspace.
> Diff:
> !image-2025-03-20-22-43-09-044.png|width=300!
> We have several places in the code which use TableMetadataRef#get and assume
> a low cost for it.
> Currently we have about 0.93% of CPU spent on this operation in total. If we
> check the percentage for (compaction + flush) threads, it is 5.4%, and 9.4%
> for compaction only ( [^5.1_cpu.html] ).
> Not sure if it is easy to reduce the overhead in TableMetadataRef#get itself,
> but we can also avoid it in many cases with small adjustments to the logic on
> the caller side, so that TableMetadataRef#get is not invoked too frequently:
> 1) org.apache.cassandra.db.ColumnFamilyStore#isRowCacheEnabled - by default
> the row cache is fully disabled, so it is probably better to check whether it
> is enabled as the first condition:
> !image-2025-03-20-22-46-34-571.png|width=300!
> 2) org.apache.cassandra.db.memtable.TrieMemtable#getFlushSet - we can look up
> the metadata once at the beginning of the getFlushSet logic
> !image-2025-03-20-22-52-40-818.png|width=300!
> !image-2025-03-20-22-53-25-001.png|width=300!
> 3) org.apache.cassandra.io.sstable.SSTableIdentityIterator.create - consider
> whether we can retrieve TableMetadata at the beginning of a compaction and
> use it for the duration of that compaction.
> !image-2025-03-20-22-56-31-298.png|width=300!
> 4) org.apache.cassandra.io.sstable.keycache.KeyCacheSupport.getCacheKey -
> consider whether we can retrieve only the needed id/indexName fields once
> (at least id does not look like a dynamically changed parameter)
> !image-2025-03-20-22-58-00-837.png|width=300!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org