[
https://issues.apache.org/jira/browse/CASSANDRA-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18038180#comment-18038180
]
Dmitry Konstantinov commented on CASSANDRA-20226:
-------------------------------------------------
{quote}CI w/ the full "post-commit" profile:
[http://ci-cassandra.infra.datastax.com/job/cassandra/52/]
{quote}
All failures are flaky or not related to the changes:
* dtest-latest.json_test.TestFromJsonSelect test_selecting_pkey_as_json - not
related to the changes, fails as well in
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* dtest-latest.client_request_metrics_test.TestClientRequestMetrics
test_client_request_metrics - fails consistently on
[https://ci-cassandra.apache.org/job/Cassandra-trunk/]
* dtest-latest.cqlsh_tests.test_cqlsh.TestCqlsh test_describe_round_trip -
local re-run passed
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy
test_writing_with_max_output_size - local re-run passed
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy
test_boolstyle_round_trip - not related to the changes, CQLSH formatting issue
* dtest-latest.cqlsh_tests.test_cqlsh.TestCqlshUnicode test_cqlsh_input_cmdline
- local re-run passed
* dtest-latest.json_test.TestFromJsonUpdate test_complex_data_types - not
related to the changes, protocol level issues: ConnectionShutdown('CRC mismatch
on header
* dtest-latest.json_test.TestJsonFullRowInsertSelect test_complex_schema - not
related to the changes, protocol level issues: ConnectionShutdown('CRC mismatch
on header
* dtest-latest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy
test_source_copy_round_trip - local re-run passed
* dtest-upgrade-novnode.upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_indev_4_0_x_To_indev_trunk
test_map_item_conditional - flaky test, similar issues in
[https://ci-cassandra.apache.org/job/Cassandra-trunk/2304]
* dtest.materialized_views_test.TestMaterializedViews
test_complex_mv_select_statements - not related to the changes, observed in
other CI runs as well, CASSANDRA-17886
* dtest.json_test.TestFromJsonUpdate test_collection_update - not related to
the changes, protocol level issues: ConnectionShutdown('CRC mismatch on header
* dtest.cqlsh_tests.test_cqlsh.TestCqlsh test_describe_functions - not related
to the changes, protocol level issues: ConnectionShutdown('CRC mismatch on
header
* dtest.cqlsh_tests.test_cqlsh_copy.TestCqlshCopy test_reading_with_ttl - local
re-run passed
* db.compaction.LeveledGenerationsTest testEmptyLevel-latest_jdk11_x86_64 -
fails on [https://ci-cassandra.apache.org/job/Cassandra-trunk/]
* distributed.upgrade.ClusterMetadataUpgradeHarryTest
simpleUpgradeTest-_jdk11_x86_64
* fuzz.sai.MultiNodeSAITest indexOnlySaiTest-cassandra.testtag_IS_UNDEFINED -
known flaky, CASSANDRA-20307
* service.accord.serializers.CommandsForKeySerializerTest
test-latest_jdk11_x86_64 - not related to the changes, test issue:
IllegalArgumentException: Could not generate a unique value after 10k attempts
at accord.utils.Gens
* simulator.test.AccordHarrySimulationTest test-_jdk11_x86_64 - not related to
the changes, fails as well in
[https://ci-cassandra.apache.org/job/Cassandra-trunk/2318/#showFailuresLink]
* simulator.test.HarrySimulatorTest test-_jdk11_x86_64 - not related to the
changes, fails as well in
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* simulator.test.ShortAccordSimulationTest simulationTest-_jdk11_x86_64 - not
related to the changes, fails as well in
[http://ci-cassandra.infra.datastax.com/job/cassandra/48/#showFailuresLink]
* simulator.test.ShortPaxosSimulationTest
selfReconcileTest-cassandra.testtag_IS_UNDEFINED - not related to the changes,
fails as well in [https://ci-cassandra.apache.org/job/Cassandra-trunk/2318]
> Reduce contention in MemtableAllocator.allocate
> -----------------------------------------------
>
> Key: CASSANDRA-20226
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20226
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Local/Memtable
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: 5.1_batch_LongAdder.html, 5.1_batch_addAndGet.html,
> 5.1_batch_alloc_batching.html, 5.1_batch_baseline.html,
> 5.1_batch_pad_allocated.html, CASSANDRA-20226_ci_summary.htm,
> CASSANDRA-20226_results_details.tar.xz,
> ci_summary_netudima_CASSANDRA-20226-trunk_52.html, cpu_profile_batch.html,
> image-2025-01-20-23-38-58-896.png, image-2025-11-10-00-04-57-497.png,
> profile.yaml, results_details_netudima_CASSANDRA-20226-trunk_52.tar.xz,
> test_results_m8i.4xlarge_heap_buffers.html,
> test_results_m8i.4xlarge_heap_buffers.png,
> test_results_m8i.4xlarge_offheap_objects.html,
> test_results_m8i.4xlarge_offheap_objects.png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> At a high batch-insert rate, NativeAllocator.allocate appears to be a
> bottleneck, probably because of contention within its logic.
> !image-2025-01-20-23-38-58-896.png|width=300!
> [^cpu_profile_batch.html]
> There are at least 2 places in the logic worth assessing:
> # the allocation cycle in MemtablePool.SubPool#tryAllocate. This logic has a
> while loop with a CAS, which can be inefficient under high contention.
> Similar to CASSANDRA-15922, we can try replacing it with addAndGet (we need
> to check that this does not break the allocator logic)
> # the swap region logic in NativeAllocator.trySwapRegion (under a high insert
> rate, 1MiB regions can be swapped quite frequently)
> Reproducing test details:
> * test logic
> {code:java}
> ./tools/bin/cassandra-stress "user profile=./profile.yaml no-warmup ops(insert=1) n=10m" -rate threads=100 -node somenode
> {code}
> * Cassandra version: 5.0.3
> * configuration changes compared to default:
> {code:java}
> memtable_allocation_type: offheap_objects
> memtable:
>   configurations:
>     skiplist:
>       class_name: SkipListMemtable
>     trie:
>       class_name: TrieMemtable
>       parameters:
>         shards: 32
>     default:
>       inherits: trie
> {code}
> * 1 node cluster
> * OpenJDK jdk-17.0.12+7
> * Linux kernel: 4.18.0-240.el8.x86_64
> * CPU: 16 cores, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
> * RAM: 46GiB
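
The first item in the issue description (replacing the CAS loop in MemtablePool.SubPool#tryAllocate with addAndGet) can be illustrated with a minimal sketch. This is not Cassandra's code: the class SubPoolSketch, its fields, and both method names are hypothetical, and the only assumption is a single counter bounded by a fixed limit. It contrasts a read-check-compareAndSet retry loop with an unconditional addAndGet that is rolled back when the limit is exceeded.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a bounded allocation counter; not Cassandra's actual SubPool. */
final class SubPoolSketch {
    private final long limit;
    private final AtomicLong allocated = new AtomicLong();

    SubPoolSketch(long limit) {
        this.limit = limit;
    }

    /** CAS loop: read the counter, check the limit, retry if another thread won the race. */
    boolean tryAllocateCas(long size) {
        while (true) {
            long current = allocated.get();
            if (current + size > limit) {
                return false; // not enough room, counter untouched
            }
            if (allocated.compareAndSet(current, current + size)) {
                return true;
            }
            // compareAndSet failed: another thread changed the counter, so loop again
        }
    }

    /** addAndGet variant: reserve first with one atomic add, undo the reservation on overshoot. */
    boolean tryAllocateAdd(long size) {
        long next = allocated.addAndGet(size);
        if (next <= limit) {
            return true;
        }
        allocated.addAndGet(-size); // roll back; the counter briefly exceeded the limit
        return false;
    }

    long allocated() {
        return allocated.get();
    }
}
```

In this sketch the addAndGet variant never retries, but the counter can transiently exceed the limit between the add and the rollback, so a concurrent smaller request may be rejected even though it would have fit. That kind of side effect is presumably what the note about not breaking the allocator logic refers to.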