Hey Ki-taek,

Thank you for sharing the information!
Based on the results above, it appears that the allocated CPUs may not
have been configured correctly.
Just to clarify: the alien-thread options only apply when deploying
Crimson with BlueStore as the backend.
Since Seastore is native to the Crimson-OSD architecture, there is no need
to set additional CPU-allocation options for it; Seastore runs on the same
CPU set as the Seastar reactors.
I'll update our documentation to better highlight this distinction,
especially as our goal is for Seastore to become the default object store
in Crimson.
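
To make the distinction concrete, here is a minimal ceph.conf-style
sketch. Please treat the exact option names as something to verify against
the documentation for your release, as I am quoting them from memory:

```ini
[osd]
# Crimson + BlueStore: BlueStore work runs on "alien" threads, so their
# CPU placement is configured separately from the Seastar reactor CPUs.
crimson_seastar_cpu_cores = 0-7
crimson_alien_op_num_threads = 6
crimson_alien_thread_cpu_cores = 26-29,36-39

# Crimson + Seastore: the alien-thread options above are not needed;
# Seastore shares the reactor CPU set, so this alone is sufficient:
#   crimson_seastar_cpu_cores = 0-7
```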

For re-testing the CPU scaling behavior, I would recommend varying the
crimson_seastar_num_threads option.
The default values of the other options should already provide good
performance out of the box.
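
As a sketch of the sweep I have in mind (ceph.conf-style; restart the OSD
between runs, and note that, as far as I recall, this option is an
alternative to explicit pinning via crimson_seastar_cpu_cores):

```ini
[osd]
# Re-run the same fio workload once per value, e.g. 2, 4, 6, 8,
# and compare the IOPS measured at each point:
crimson_seastar_num_threads = 4
```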

We really appreciate early deployment feedback; it is highly valuable
to us.
Would you be able to share the updated results in the #crimson Slack
channel so we could discuss it more easily?

Thanks,
Matan

On Tue, Aug 12, 2025 at 9:43 AM Ki-taek Lee <ktlee4...@gmail.com> wrote:

> Hello Ceph community,
>
> I am evaluating Crimson OSD + Seastore performance for potential deployment
> in a distributed storage environment.
> With BlueStore, I have been able to achieve satisfactory performance in
> my FIO tests for 4K random read/write IOPS.
>
> However, when testing Crimson OSD + Seastore, I observed that 4K random
> read/write IOPS do not scale as expected when increasing the number of
> SSDs/OSDs. The performance plateaus beyond a certain point or is much lower
> than expected. (See attached test results.)
>
> Test Environment:
> - Cluster: 8 clients, 1 OSD
> - Hardware: 40-core CPUs, 377 GiB DRAM
> - Image SHA (quay.io): e0543089a9e9cae97999761059eaccdf6bb22e9e
> - Configuration parameters:
>     osd_memory_target = 34359738368
>     crimson_osd_scheduler_concurrency = 0
>     seastore_max_concurrent_transactions = 16
>     crimson_osd_obc_lru_size = 8192
>     seastore_cache_lru_size = 16G
>     seastore_obj_data_write_amplification = 4
>     seastore_journal_batch_capacity = 1024
>     seastore_journal_batch_flush_size = 256M
>     seastore_journal_iodepth_limit = 16
>     seastore_journal_batch_preferred_fullness = 0.8
>     seastore_segment_size = 128M
>     seastore_device_size = 512G
>     seastore_block_create = true
>     seastore_default_object_metadata_reservation = 1073741824
>     rbd_cache = false
>     rbd_cache_writethrough_until_flush = true
>     rbd_op_threads = 16
>
> Replication policy:
> - 4096 PGs, no replication (only 1 copy)
>
> Test Results:
>
> 1 SSD test (varying number of allocated CPUs, alien threads = 26-29,
> 36-39):
> num CPU | 4k randread | 4k randwrite | Allocated CPU sets
> 2       | 126772      | 14830        | 0-1
> 4       | 107860      | 16451        | 0-3
> 6       | 113741      | 17019        | 0-5
> 8       | 132060      | 16099        | 0-7
>
> SSD scaling test (2 CPUs per SSD):
> OSD CPU mapping: OSD.0 (0-1), OSD.1 (10-11), OSD.2 (2-3), OSD.3 (12-13),
> ..., OSD.15 (34-35), Alien threads (26-29, 36-39)
> num SSD | 4k randread | 4k randwrite
> 4       | 861273      | 22360
> 8       | 1022793     | 22786
> 12      | 1019161     | 21211
> 16      | 927570      | 20502
>
> SSD scaling test (1 CPU per SSD):
> OSD CPU mapping: OSD.0 (0), OSD.1 (10), OSD.2 (2), OSD.3 (12), ..., OSD.15
> (24), Alien CPUs: 1, 11, 3, 13, ..., 15, 25
> num SSD | 4k randread | 4k randwrite
> 4       | 936685      | 13730
> 8       | 1048204     | 18259
> 12      | 922727      | 23078
> 16      | 987838      | 30792
>
> Questions:
> 1. Since Seastore is still under active development, are there any known
> unresolved performance issues that could explain this scaling behavior?
> 2. Are there recommended tuning parameters for improving small-block read
> scalability in multi-SSD configurations?
> 3. Regarding alien threads, are there best practices for CPU pinning or
> NUMA-aware placement that have shown measurable improvements?
> 4. Any additional guidance for maximizing IOPS with Crimson OSD + Seastore
> would be greatly appreciated.
>
> My goal is to be ready to switch from BlueStore to Crimson + Seastore after
> it becomes stable and shows reasonable performance compared to BlueStore,
> so I’d like to understand the current limitations and tuning opportunities.
>
> Thank you,
> Ki-taek Lee
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>