This thread is off in left field and needs to be brought back to how things work.
While multiple OSDs can use the same device for block/wal partitions, they
each need their own partition: osd.0 could use nvme0n1p1, osd.1 nvme0n1p2,
and so on. You cannot point several OSDs at the same partition. ceph-volume
will not create the db/wal partitions for you; you need to create them
yourself before creating the OSD. There is also no need to put a filesystem
on top of the wal/db partition; that is wasted overhead that will only slow
things down.

Back to the original email:

> Or do I need to use osd-db=/dev/nvme0n1p2 for data=/dev/sdb,
> osd-db=/dev/nvme0n1p3 for data=/dev/sdc, and so on?

This is what you need to do, but as said above, you need to create the
partitions for --block-db yourself. You talked about having a 10GB partition
for this, but the general recommendation for block.db partitions is 10GB per
1TB of OSD. If your OSD is a 4TB disk, you should be looking closer to a
40GB block.db partition. If the block.db partition is too small, then once
it fills up it will spill over onto the data volume and slow things down.

> And just to make sure - if I specify "--osd-db", I don't need
> to set "--osd-wal" as well, since the WAL will end up on the
> DB partition automatically, correct?

This is correct. The WAL will automatically be placed on the DB if not
otherwise specified.

I don't use ceph-deploy, but the process for creating the OSDs should be
something like this. After the OSDs are created, it is also a good idea to
make sure they are not referencing the DB partition by its /dev/nvme0n1p2
device name, as that name can change across reboots if you have multiple
NVMe devices.

# Make sure the disks are clean and ready to use as an OSD
for hdd in /dev/sd{b..c}; do
    ceph-volume lvm zap $hdd --destroy
done

# Create the nvme db partitions (assuming 10G size for a 1TB OSD)
for partition in {2..3}; do
    sgdisk -n $partition:0:+10G -c $partition:'ceph db' /dev/nvme0n1
done

# Create the OSDs
echo "/dev/sdb /dev/nvme0n1p2
/dev/sdc /dev/nvme0n1p3" | while read hdd db; do
    ceph-volume lvm create --bluestore --data $hdd --block.db $db
done

# Fix the OSDs to look for the block.db partition by UUID instead of its
# device name.
for db in /var/lib/ceph/osd/*/block.db; do
    dev=$(readlink $db | grep -Eo 'nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+' || echo false)
    if [[ "$dev" != false ]]; then
        uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'"${dev}"'$/ {print $9}')
        ln -sf /dev/disk/by-partuuid/$uuid $db
    fi
done
systemctl restart ceph-osd.target
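If you want to double-check the result afterwards (a rough sketch on top of
the steps above, not something ceph-volume does for you), printing the link
targets should confirm that every block.db symlink now resolves through
/dev/disk/by-partuuid, and ceph-volume lvm list will also report which db
device each OSD is using:

# Show where each block.db symlink points after the fix above
for db in /var/lib/ceph/osd/*/block.db; do
    printf '%s -> %s\n' "$db" "$(readlink "$db")"
done

# ceph-volume's own view of the OSDs and their db devices
ceph-volume lvm list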
On Fri, May 11, 2018 at 10:59 AM João Paulo Sacchetto Ribeiro Bastos
<joaopaulos...@gmail.com> wrote:

> Actually, if you go to https://ceph.com/community/new-luminous-bluestore/
> you will see that DB/WAL work on a XFS partition, while the data itself
> goes on a raw block.
>
> Also, I told you the wrong command in the last mail. When i said --osd-db
> it should be --block-db.
>
> On Fri, May 11, 2018 at 11:51 AM Oliver Schulz
> <oliver.sch...@tu-dortmund.de> wrote:
>
>> Hi,
>>
>> thanks for the advice! I'm a bit confused now, though. ;-)
>> I thought DB and WAL were supposed to go on raw block
>> devices, not file systems?
>>
>> Cheers,
>>
>> Oliver
>>
>> On 11.05.2018 16:01, João Paulo Sacchetto Ribeiro Bastos wrote:
>> > Hello Oliver,
>> >
>> > As far as I know yet, you can use the same DB device for about 4 or 5
>> > OSDs, just need to be aware of the free space. I'm also developing a
>> > bluestore cluster, and our DB and WAL will be in the same SSD of about
>> > 480GB serving 4 OSD HDDs of 4 TB each. About the sizes, its just a
>> > feeling because I couldn't find yet any clear rule about how to measure
>> > the requirements.
>> >
>> > * The only concern that took me some time to realize is that you should
>> > create a XFS partition if using ceph-deploy because if you don't it will
>> > simply give you a RuntimeError that doesn't give any hint about what's
>> > going on.
>> >
>> > So, answering your question, you could do something like:
>> > $ ceph-deploy osd create --bluestore --data=/dev/sdb --block-db /dev/nvme0n1p1 $HOSTNAME
>> > $ ceph-deploy osd create --bluestore --data=/dev/sdc --block-db /dev/nvme0n1p1 $HOSTNAME
>> >
>> > On Fri, May 11, 2018 at 10:35 AM Oliver Schulz
>> > <oliver.sch...@tu-dortmund.de> wrote:
>> >
>> >     Dear Ceph Experts,
>> >
>> >     I'm trying to set up some new OSD storage nodes, now with
>> >     bluestore (our existing nodes still use filestore). I'm
>> >     a bit unclear on how to specify WAL/DB devices: Can
>> >     several OSDs share one WAL/DB partition? So, can I do
>> >
>> >     ceph-deploy osd create --bluestore --osd-db=/dev/nvme0n1p2
>> >     --data=/dev/sdb HOSTNAME
>> >
>> >     ceph-deploy osd create --bluestore --osd-db=/dev/nvme0n1p2
>> >     --data=/dev/sdc HOSTNAME
>> >
>> >     ...
>> >
>> >     Or do I need to use osd-db=/dev/nvme0n1p2 for data=/dev/sdb,
>> >     osd-db=/dev/nvme0n1p3 for data=/dev/sdc, and so on?
>> >
>> >     And just to make sure - if I specify "--osd-db", I don't need
>> >     to set "--osd-wal" as well, since the WAL will end up on the
>> >     DB partition automatically, correct?
>> >
>> >     Thanks for any hints,
>> >
>> >     Oliver
>> >
>> > --
>> > João Paulo Sacchetto Ribeiro Bastos
>> > +55 31 99279-7092
>>
> --
> João Paulo Bastos
> DevOps Engineer at Mav Tecnologia
> Belo Horizonte - Brazil
> +55 31 99279-7092