[ 
https://issues.apache.org/jira/browse/CASSANDRA-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18038180#comment-18038180
 ] 

Dmitry Konstantinov commented on CASSANDRA-20226:
-------------------------------------------------

{quote}CI w/ the full "post-commit" profile: 
[http://ci-cassandra.infra.datastax.com/job/cassandra/52/]
{quote}
All failures are flaky or not related to the changes:
* dtest-latest.json_test.TestFromJsonSelect test_selecting_pkey_as_json - not 
related to the changes, fails as well in 
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* dtest-latest.client_request_metrics_test.TestClientRequestMetrics 
test_client_request_metrics - fails consistently on 
[https://ci-cassandra.apache.org/job/Cassandra-trunk/]
* dtest-latest.cqlsh_tests.test_cqlsh.TestCqlsh test_describe_round_trip - 
local re-run passed
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy 
test_writing_with_max_output_size - local re-run passed
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy 
test_boolstyle_round_trip - not related to the changes, CQLSH formatting issue
* dtest-latest.cqlsh_tests.test_cqlsh.TestCqlshUnicode test_cqlsh_input_cmdline 
- local re-run passed
* dtest-latest.json_test.TestFromJsonUpdate test_complex_data_types - not 
related to the changes, protocol level issues: ConnectionShutdown('CRC mismatch 
on header
* dtest-latest.json_test.TestJsonFullRowInsertSelect test_complex_schema - not 
related to the changes, protocol level issues: ConnectionShutdown('CRC mismatch 
on header
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy 
test_source_copy_round_trip - local re-run passed
* dtest-upgrade-novnode.upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_indev_4_0_x_To_indev_trunk 
test_map_item_conditional - flaky test, similar issues in 
[https://ci-cassandra.apache.org/job/Cassandra-trunk/2304]
* dtest.materialized_views_test.TestMaterializedViews 
test_complex_mv_select_statements - not related to the changes, observed in 
other CI runs as well, CASSANDRA-17886
* dtest.json_test.TestFromJsonUpdate test_collection_update - not related to 
the changes, protocol level issues: ConnectionShutdown('CRC mismatch on header
* dtest.cqlsh_tests.test_cqlsh.TestCqlsh test_describe_functions - not related 
to the changes, protocol level issues: ConnectionShutdown('CRC mismatch on 
header
* dtest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy test_reading_with_ttl - local 
re-run passed
* db.compaction.LeveledGenerationsTest testEmptyLevel-latest_jdk11_x86_64 - 
fails on [https://ci-cassandra.apache.org/job/Cassandra-trunk/]
* distributed.upgrade.ClusterMetadataUpgradeHarryTest 
simpleUpgradeTest-_jdk11_x86_64
* fuzz.sai.MultiNodeSAITest indexOnlySaiTest-cassandra.testtag_IS_UNDEFINED - 
known flaky, CASSANDRA-20307
* service.accord.serializers.CommandsForKeySerializerTest 
test-latest_jdk11_x86_64 - not related to the changes, test issue: 
IllegalArgumentException: Could not generate a unique value after 10k attempts 
at accord.utils.Gens
* simulator.test.AccordHarrySimulationTest test-_jdk11_x86_64 - not related to 
the changes, fails as well in 
[https://ci-cassandra.apache.org/job/Cassandra-trunk/2318/#showFailuresLink]
* simulator.test.HarrySimulatorTest test-_jdk11_x86_64 - not related to the 
changes, fails as well in 
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* simulator.test.ShortAccordSimulationTest simulationTest-_jdk11_x86_64 - not 
related to the changes, fails as well in 
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* simulator.test.ShortPaxosSimulationTest 
selfReconcileTest-cassandra.testtag_IS_UNDEFINED - not related to the changes, 
fails as well in [https://ci-cassandra.apache.org/job/Cassandra-trunk/2318]

> Reduce contention in MemtableAllocator.allocate
> -----------------------------------------------
>
>                 Key: CASSANDRA-20226
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20226
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Memtable
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: 5.1_batch_LongAdder.html, 5.1_batch_addAndGet.html, 
> 5.1_batch_alloc_batching.html, 5.1_batch_baseline.html, 
> 5.1_batch_pad_allocated.html, CASSANDRA-20226_ci_summary.htm, 
> CASSANDRA-20226_results_details.tar.xz, 
> ci_summary_netudima_CASSANDRA-20226-trunk_52.html, cpu_profile_batch.html, 
> image-2025-01-20-23-38-58-896.png, image-2025-11-10-00-04-57-497.png, 
> profile.yaml, results_details_netudima_CASSANDRA-20226-trunk_52.tar.xz, 
> test_results_m8i.4xlarge_heap_buffers.html, 
> test_results_m8i.4xlarge_heap_buffers.png, 
> test_results_m8i.4xlarge_offheap_objects.html, 
> test_results_m8i.4xlarge_offheap_objects.png
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> At a high insert batch rate, NativeAllocator.allocate appears to be a 
> bottleneck, probably caused by contention within its logic.
> !image-2025-01-20-23-38-58-896.png|width=300!
> [^cpu_profile_batch.html]
> The logic has at least the following 2 potential places to assess:
>  # allocation cycle in MemtablePool.SubPool#tryAllocate. This logic has a 
> while loop with a CAS, which can be inefficient under high contention. 
> Similar to CASSANDRA-15922, we can try to replace it with addAndGet (need to 
> check that it does not break the allocator logic)
>  # swap region logic in NativeAllocator.trySwapRegion (under a high insert 
> rate 1MiB regions can be swapped quite frequently)
> Reproducing test details:
>  * test logic
> {code:java}
> ./tools/bin/cassandra-stress "user profile=./profile.yaml no-warmup ops(insert=1) n=10m" -rate threads=100 -node somenode
> {code}
>  * Cassandra version: 5.0.3
>  * configuration changes compared to default:
> {code:java}
> memtable_allocation_type: offheap_objects
> memtable:
>   configurations:
>     skiplist:
>       class_name: SkipListMemtable
>     trie:
>       class_name: TrieMemtable
>       parameters:
>         shards: 32
>     default:
>       inherits: trie 
> {code}
>  * 1 node cluster
>  * OpenJDK jdk-17.0.12+7
>  * Linux kernel: 4.18.0-240.el8.x86_64
>  * CPU: 16 cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
>  * RAM: 46GiB
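
The first item in the description above (replacing a CAS retry loop with addAndGet) can be illustrated with a minimal sketch. This is not the actual MemtablePool.SubPool code: the class names CasPool and AddAndGetPool, the single "allocated" counter and the fixed "limit" are hypothetical simplifications chosen only to show the two update styles side by side.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; names and structure are assumptions, not Cassandra code.
final class AllocSketch {

    // Variant 1: CAS retry loop. Under contention, a failed compareAndSet
    // forces the thread to re-read the counter and try again.
    static final class CasPool {
        private final long limit;
        private final AtomicLong allocated = new AtomicLong();

        CasPool(long limit) { this.limit = limit; }

        boolean tryAllocate(long size) {
            while (true) {
                long cur = allocated.get();
                if (cur + size > limit)
                    return false;                      // would exceed the limit
                if (allocated.compareAndSet(cur, cur + size))
                    return true;                       // reserved successfully
                // another thread changed the counter; retry
            }
        }

        long allocated() { return allocated.get(); }
    }

    // Variant 2: unconditional addAndGet followed by a rollback on overshoot.
    // There is no retry loop, but the counter can briefly exceed the limit,
    // and a concurrent caller may fail while the overshoot is still visible.
    static final class AddAndGetPool {
        private final long limit;
        private final AtomicLong allocated = new AtomicLong();

        AddAndGetPool(long limit) { this.limit = limit; }

        boolean tryAllocate(long size) {
            long next = allocated.addAndGet(size);
            if (next <= limit)
                return true;
            allocated.addAndGet(-size);                // undo the reservation
            return false;
        }

        long allocated() { return allocated.get(); }
    }

    public static void main(String[] args) {
        CasPool cas = new CasPool(100);
        System.out.println(cas.tryAllocate(60) + " " + cas.tryAllocate(50) + " " + cas.allocated());

        AddAndGetPool add = new AddAndGetPool(100);
        System.out.println(add.tryAllocate(60) + " " + add.tryAllocate(50) + " " + add.allocated());
    }
}
```

In single-threaded use both variants behave the same. The difference appears only under concurrency, where the second variant's transient overshoot is the kind of behaviour change the description says needs to be checked against the allocator logic.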



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
