I'm expanding a small Ceph cluster from 4 nodes to 5 nodes. The new node is a 
bit more sophisticated than the others, since it has some SSD storage that I'd 
like to use for DB+WAL (which I haven't done before; until now the cluster has 
had only rotational disks).

I'm using cephadm for orchestration, and normally add OSDs via "ceph orch 
daemon add osd". I prefer to add the OSDs in this "manual" way (rather than 
"ceph orch apply" with a spec) mainly because my infrastructure is not uniform 
(for better or worse, I'm working with hardware that becomes available in 
different ways over time, as I gradually upgrade things and add things).

Looking at this page:

https://docs.ceph.com/en/squid/cephadm/services/osd/

... it isn't entirely clear to me whether it's possible to specify a separate 
DB device when using the "ceph orch daemon add osd" procedure. There is a 
description of how to do it with a service spec, but how you would specify the 
DB device for "ceph orch daemon add osd" does not appear to be described.

So, my first question is whether it's possible to specify a separate DB device 
via "ceph orch daemon add osd"?
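If such a form does exist, I'd guess it would look roughly like this (the host 
name and device paths here are invented, and I haven't verified that this 
extended syntax accepts a DB device in my release):

```shell
# Sketch only: "newhost", /dev/sdb, and /dev/nvme0n1 are placeholders.
# I have not confirmed that daemon add accepts db_devices this way.
ceph orch daemon add osd newhost:data_devices=/dev/sdb,db_devices=/dev/nvme0n1
```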

If not, I'll need to explore the service spec approach. I suppose I can use the 
"unmanaged: true" option in the spec (to keep it as "manual" as possible).

The remaining puzzle is how to use the SSD as DB+WAL for more than one OSD. At 
this point, the SSD is the raw device — I haven't done anything with it 
manually in LVM or whatever. In the service spec description above, I see that 
there is a "db_slots" key. So, I suppose that I could specify the "whole" SSD 
and provide for the number of slots? However, I don't necessarily want every 
slot to be the same size (because of my unfortunately heterogeneous hardware). 
So, I also see that there is a "block_db_size" and "block_wal_size". But it's 
unclear how this relates to "db_slots" — which one would determine how the SSD 
is sliced up?
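To make the question concrete, the kind of spec I'd be contemplating looks 
roughly like this (service_id, host, and device paths are all invented, and 
the db_slots vs. block_db_size choice is exactly the part I'm unsure about):

```yaml
# Sketch only: host, service_id, and device paths are placeholders.
service_type: osd
service_id: newhost_hdd_ssd_db
placement:
  hosts:
    - newhost
unmanaged: true
spec:
  data_devices:
    rotational: 1
  db_devices:
    paths:
      - /dev/nvme0n1
  db_slots: 4            # split the SSD into 4 equal DB/WAL slots...
  # block_db_size: 60G   # ...or fix a per-OSD DB size; my reading of the
                         # ceph-volume batch docs is that you pick one,
                         # not both
```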

I'd actually be happy to pre-slice the SSD (e.g. with LVM) and then directly 
specify which SSD slice is the DB+WAL for which OSD, if that's a feasible 
approach. Though, I'd still be interested in knowing whether I need to set 
something for "block_db_size" and "block_wal_size", or whether it's enough to 
just actually make a certain size of LVM volume available for DB+WAL.
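Concretely, the pre-slicing I have in mind would be something like this (VG/LV 
names, devices, and sizes invented; I'm assuming the prepare step would run 
inside the cephadm shell so it uses the containerized ceph-volume):

```shell
# Placeholder device, VG/LV names, and sizes.
pvcreate /dev/nvme0n1
vgcreate ceph-db /dev/nvme0n1
lvcreate -L 60G -n db-osd-a ceph-db
lvcreate -L 30G -n db-osd-b ceph-db   # slices deliberately unequal

# Hand one slice to each OSD at prepare time. My assumption is that an
# existing LV is used at whatever size it already is, with no need to set
# block_db_size anywhere.
cephadm shell -- ceph-volume lvm prepare \
    --data /dev/sdb --block.db ceph-db/db-osd-a
```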

Normally I'd just experiment, but that might be disruptive to the working 
cluster. I guess I could at least turn off rebalancing while I try things out?
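By "turn off rebalancing" I mean setting the usual cluster flags while I 
experiment, something like:

```shell
# Pause data movement while experimenting...
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover
# ...and re-enable when done.
ceph osd unset norecover
ceph osd unset nobackfill
ceph osd unset norebalance
```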

The other documentation I'm now reading is the documentation for ceph-volume, 
which appears to be related:

https://docs.ceph.com/en/squid/ceph-volume/lvm/batch/

It mentions, for instance, things like db_slots and block_db_size. The 
implication is that db_slots is an alternative to block_db_size — that you 
wouldn't specify both, for instance.

I'm also reading the ceph-volume docs for "prepare". I suppose if I find that 
more suitable, it might be possible to "prepare" an OSD with ceph-volume and 
then "adopt" it with cephadm?
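Reading further, it looks like the cephadm side of that might be its OSD 
activate command, though I've only read about it and haven't tried it:

```shell
# After "ceph-volume lvm prepare" on the host, this is documented to have
# cephadm deploy daemon containers for any OSDs it finds there.
# "newhost" is a placeholder.
ceph cephadm osd activate newhost
```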

Well, just writing the email has given me a bit more clarity about things to 
try, but I'd certainly be happy for any guidance.


Ryan Rempel

Director of Information Technology

Canadian Mennonite University
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
