Hey all!

I am evaluating Ceph at work and have some questions for people with more experience with it, to avoid making dumb mistakes... I hope this is the right place to ask.

So, to give some context: we have 5 machines with about 400 TiB of raw storage in total, in the form of 8 HDDs per host. The hardware is not all the same; two nodes have a rather low core count and would be dedicated entirely to Ceph, while the other three I would like to use in an HCI fashion to run VMs off of Ceph. I'd like to at least be able to take one node at a time out of rotation for maintenance. I'd be using cephadm for the setup, which in my testing works really well.
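
For reference, the kind of cephadm OSD spec I have in mind is just a plain drive-group that grabs all spinning disks (service id and host pattern are made up), applied with "ceph orch apply -i osd-spec.yaml":

    service_type: osd
    service_id: hdd-osds
    placement:
      host_pattern: 'ceph-node-*'
    spec:
      data_devices:
        rotational: 1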

As for the questions:

1. 3x replication is a tough sell IMO, so for better storage utilization I am looking at EC. 2+2 would be an obvious choice and seems to be rather performant. But what about MSR rules with higher values, e.g. 5+3 with 2 OSDs per host across 4 hosts? Or, taken to more of an extreme, 17+7 with 6 OSDs per host across 4 hosts? This should still allow me to take one host out for maintenance at a time, while giving better storage utilization. Is this a stupid idea? (16+8 might make more sense, to also be able to sustain a disk failure while a host is in maintenance.)
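
To make this concrete, the 5+3 variant I have in mind would look roughly like this, if I'm reading the MSR-related profile options correctly (profile and pool names are just examples):

    ceph osd erasure-code-profile set ec-5-3-msr \
        k=5 m=3 \
        crush-failure-domain=host \
        crush-osds-per-failure-domain=2 \
        crush-num-failure-domains=4 \
        crush-device-class=hdd
    ceph osd pool create ec53pool erasure ec-5-3-msr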

2. I've read that I can update the crush rule of an EC pool after the fact, to change both the failure domain and the device class of the pool. What about changing k, m, or the plugin type? My understanding is that this is not supported, but Ceph didn't stop me from doing it, and it seems to do /something/ when those values are changed?
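
For the failure-domain / device-class part, what I mean concretely is generating a new rule from an updated profile and pointing the pool at it, roughly (names made up):

    ceph osd erasure-code-profile set ec-2-2-v2 \
        k=2 m=2 crush-failure-domain=host crush-device-class=hdd
    ceph osd crush rule create-erasure ec-2-2-v2-rule ec-2-2-v2
    ceph osd pool set mypool crush_rule ec-2-2-v2-rule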

3. Right now we are using libvirt with qcow2 images on local storage. I know that with Ceph the commonly recommended way would be to use RBD instead, but we have an existing proprietary tape archive for backups whose official client can, to my knowledge, only do file-based backups, and around which we already have a system for backing up live snapshots of VMs. How bad of an idea would it actually be to keep qcow2, just on top of (kernel- or fuse-mounted?) CephFS? So far it seems to perform on par with RBD in my testing, but both also seem to fully saturate the single OSD per host I am testing with anyway.
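
To spell out what I mean by qcow2 on CephFS: a kernel mount exposed to libvirt as a plain directory pool, along these lines (monitor addresses, client name and paths are placeholders, and the secret comes from the keyring that "ceph fs authorize" prints):

    ceph fs authorize cephfs client.libvirt / rw
    mount -t ceph 192.0.2.1,192.0.2.2,192.0.2.3:/ /var/lib/libvirt/cephfs \
        -o name=libvirt,secretfile=/etc/ceph/libvirt.secret
    virsh pool-define-as cephfs-images dir --target /var/lib/libvirt/cephfs
    qemu-img create -f qcow2 /var/lib/libvirt/cephfs/vm01.qcow2 100G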

4. In v20 there seems to be a new ec_optimizations feature for EC pools used by CephFS or RBD. Is that a good idea with this kind of workload, i.e. large qcow2 images on top of CephFS?
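
If I understand the release notes correctly, enabling it would just be a per-pool flag along the lines of the following, though I'm not sure I have the exact flag name right (pool name is an example):

    ceph osd pool set cephfs_data allow_ec_optimizations true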

5. Speaking of v20: while it is not yet the latest "active" release, extrapolating from the release cadence suggests it soon will be. Should I wait for it / start a new cluster on v20 already?

Thanks for any insights you can give!

Kind regards
Matthias Riße

