[
https://issues.apache.org/jira/browse/CASSANDRA-20918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18042864#comment-18042864
]
Dmitry Konstantinov edited comment on CASSANDRA-20918 at 12/4/25 5:38 PM:
--------------------------------------------------------------------------
A summary of limitations for the current implementation ("not supported" means
the original iterator implementation is used):
* complex columns are not supported
* tables with secondary indexes are not supported
* only the BIG SSTable format is supported (BTI is not supported)
* counters are not supported
* only Murmur3Partitioner and LocalPartitioner are supported
* nodetool garbagecollect is not supported
was (Author: dnk):
A summary of limitations for the current implementation ("not supported" means
the original iterator implementation is used):
* complex columns are not supported
* counters are not supported
* Murmur3Partitioner and LocalPartitioner are supported only
* Tables with secondary indexes are not supported
* BIG SSTable format supported only (BTI is not supported)
* nodetool garbagecollect is not supported
> Add cursor-based low allocation optimized compaction implementation
> -------------------------------------------------------------------
>
> Key: CASSANDRA-20918
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20918
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Local/Compaction, Local/SSTable
> Reporter: Josh McKenzie
> Assignee: Nitsan Wakart
> Priority: Normal
> Attachments: 7_100m_100kr_100r.png
>
> Time Spent: 5h
> Remaining Estimate: 0h
>
> Compaction does a ton of allocation and burns a lot of CPU in the process; we
> can move away from allocation with some fairly simple and straightforward
> reusable objects and infrastructure that make use of that, reducing
> allocation and thus CPU usage during compaction. Heap allocation on all
> test cases holds steady at 20 MB while regular compaction grows past 5 GB.
> This patch introduces a collection of reusable objects:
> * ReusableLivenessInfo
> * ReusableDecoratedKey
> * ReusableLongToken
> And new compaction structures that make use of those objects:
> * CompactionCursor
> * CursorCompactionPipeline
> * SSTableCursorReader
> * SSTableCursorWriter
> The linked branch also adds quite a bit of test code, benchmarks, etc.:
> ~13k lines added, 405 lines deleted
> ~8.3k lines of the delta are non-test code
> ~5k lines of the delta are test code
> Attaching a screenshot of the "messiest" benchmark case with mixed size rows
> and full merge; across various data and compaction mixes the highlight is
> that compaction as implemented here is roughly 3-5x faster in most scenarios
> and uses 20 MB of heap vs. multiple GB.
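
For readers unfamiliar with the reusable-object idea the description refers to, here is a minimal sketch in Java. It is not code from the patch: MutableToken, ImmutableToken, sumAllocating and sumReusing are hypothetical names invented for illustration, and the real ReusableLongToken and related classes may look quite different. The sketch only contrasts creating one object per record with resetting a single mutable holder.

```java
public class ReusableObjectSketch {

    /** Allocating style: a fresh immutable object for every record. */
    static final class ImmutableToken {
        final long value;

        ImmutableToken(long value) {
            this.value = value;
        }
    }

    /**
     * Hypothetical mutable holder (an assumption for this sketch, not the
     * ReusableLongToken class from the patch). One instance is reset and
     * reused for every record instead of being reallocated.
     */
    static final class MutableToken {
        private long value;

        MutableToken reset(long newValue) {
            this.value = newValue;
            return this;
        }

        long value() {
            return value;
        }
    }

    /** Creates one ImmutableToken per element of rawTokens. */
    public static long sumAllocating(long[] rawTokens) {
        long sum = 0;
        for (long raw : rawTokens) {
            ImmutableToken token = new ImmutableToken(raw);
            sum += token.value;
        }
        return sum;
    }

    /** Creates a single MutableToken and overwrites it for each element. */
    public static long sumReusing(long[] rawTokens) {
        MutableToken token = new MutableToken();
        long sum = 0;
        for (long raw : rawTokens) {
            sum += token.reset(raw).value();
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] rawTokens = {3L, -7L, 42L};
        System.out.println(sumAllocating(rawTokens));
        System.out.println(sumReusing(rawTokens));
    }
}
```

The usual caveat with this pattern is that a reused holder must not be retained past the current iteration, because the next reset overwrites its contents; callers that need to keep a value have to copy it out first.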