[ 
https://issues.apache.org/jira/browse/CASSANDRA-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Konstantinov updated CASSANDRA-21083:
--------------------------------------------
    Attachment: CASSANDRA-21088_draft_ci_summary.htm

> Optimize memtable flush logic
> -----------------------------
>
>                 Key: CASSANDRA-21083
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21083
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Memtable
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: CASSANDRA-21083.html, 
> CASSANDRA-21088_draft_ci_summary.htm, 
> CASSANDRA-21088_draft_results_details.tar.xz, profiles.zip
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Memtable flushing to disk impacts write performance and can be a limiting 
> factor for write throughput:
>  * If we cannot flush fast enough we have to limit writes to memtables due to 
> lack of available memory for them
>  * flushing logic can be CPU-intensive and compete with writing threads for 
> CPU by stealing 1-2 cores (or even more if memtable_flush_writers is set to a 
> higher value)
> Suggested optimisations:
>  # invoke MetadataCollector.updateClusteringValues only for first and last 
> clustering key in a partition, not for every row 
> ([link|https://github.com/apache/cassandra/commit/df2df1d0eefc8b603eafa87f42ed1975dfc46143])
>  # split call sites in the Cell.Serializer serialize logic to avoid 
> megamorphic calls + make cell.isCounterCell check cheaper (avoid megamorphic 
> call + pre-calculate isCounterColumn info) 
> ([link|https://github.com/apache/cassandra/commit/7653194932e9ffb966c0c1c1f76fbcf532f222a7])
>  # check whether Guardrails are enabled once at the beginning of writing, not 
> per row; avoid hidden auto-boxing when logging primitive parameters 
> ([link|https://github.com/apache/cassandra/commit/f17e835108ad6f282257e95992084a33e9d47b52])
>  # add fast return for BTreeRow.hasComplexDeletion if there were no deletions, 
> avoid column.name.bytes.hashCode if not needed, avoid capturing lambda 
> allocation in UnfilteredSerializer.serializeRowBody 
> ([link|https://github.com/apache/cassandra/commit/8c2d0f5a24a6e25832d7fae6668f01fbbccc285a])
>  # reduce allocations during serialization of NativeClustering 
> ([link|https://github.com/apache/cassandra/commit/457f2803efd2af1a919dbe56fe958627e5652fc2])
>  # do not re-map columns in serializeRowBody if they haven't changed 
> ([link|https://github.com/apache/cassandra/commit/c0f08ee437f5d468f83ff6a1c952182bc5b156a4])
>  # add flushing iterator without column filtering 
> ([link|https://github.com/apache/cassandra/commit/0f77206334320815dc80622122b40dc2f5c3f6fd])
>  # split call sites in MetadataCollector.update(Cell<?> cell) to improve cell 
> methods inlining and use monomorphic calls 
> ([link|https://github.com/apache/cassandra/commit/7fa2bf3ba4e11a2c4e7be421aba1295fb6738f18])
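
The first suggested optimisation can be illustrated with a minimal sketch. It assumes rows inside a partition are iterated in clustering order, so the first and last rows already bound every row between them. The class and method names (ClusteringBoundsSketch, Bounds, perRow, firstAndLast) are hypothetical and are not Cassandra's MetadataCollector API; clustering keys are reduced to plain ints, and range tombstones and multi-component clusterings are ignored.

```java
final class ClusteringBoundsSketch {

    /** Hypothetical min/max tracker; counts how many times it is updated. */
    static final class Bounds {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        int updates = 0;

        void update(int clustering) {
            updates++;
            if (clustering < min) min = clustering;
            if (clustering > max) max = clustering;
        }
    }

    /** Baseline: update the bounds for every row of every partition. */
    static Bounds perRow(int[][] partitions) {
        Bounds bounds = new Bounds();
        for (int[] partition : partitions)
            for (int clustering : partition)
                bounds.update(clustering);
        return bounds;
    }

    /**
     * Sketch of the optimisation: each partition is assumed to be sorted by
     * clustering, so only its first and last rows can change the bounds.
     */
    static Bounds firstAndLast(int[][] partitions) {
        Bounds bounds = new Bounds();
        for (int[] partition : partitions) {
            if (partition.length == 0)
                continue;
            bounds.update(partition[0]);
            if (partition.length > 1)
                bounds.update(partition[partition.length - 1]);
        }
        return bounds;
    }

    public static void main(String[] args) {
        // Three partitions, each already sorted by clustering value.
        int[][] partitions = { { 1, 4, 9 }, { -3, 2 }, { 7 } };
        Bounds a = perRow(partitions);
        Bounds b = firstAndLast(partitions);
        System.out.println("per row:      min=" + a.min + " max=" + a.max + " updates=" + a.updates);
        System.out.println("first + last: min=" + b.min + " max=" + b.max + " updates=" + b.updates);
    }
}
```

Both variants produce the same bounds; the second does at most two updates per partition regardless of how many rows the partition holds, which is where the saving would come from for wide partitions.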



--
This message was sent by Atlassian Jira
(v8.20.10#820010)