> Apparently those UUIDs aren't as reliable as I thought.
>
> I've had problems with a server box that hosts a ceph VM.
VM?

> Looks like the mobo disk controller is unreliable

Lemme guess, is it an IR / RoC / RAID type? As opposed to JBOD / IT? If the
former and it's an LSI SKU as most are, I'd love it if you could send me
privately the output of

    storcli64 /c0 show termlog >/tmp/termlog.txt

Sometimes flakiness is actually with the drive backplane, especially when it
has an embedded expander. In either case, updating HBA firmware sometimes
makes a real difference. And drive firmware.

> AND one of the disks passes SMART

I'm curious if it shows SATA downshifts. (Quick smartctl check sketched at the
bottom of this mail.)

> but has interface problems. So I moved the disks to an alternate box.
>
> Between relocation and dropping the one disk, neither of the 2 OSDs for that
> host will come up. If everything was running solely on static UUIDs, the good
> disk should have been findable even if its physical disk device name shifted.
> But it wasn't.

Did you try ceph-volume lvm activate --all ? (Rough example at the bottom.)

> Which brings up something I've wondered about for some time. Shouldn't it be
> possible for OSDs to be portable?

I haven't tried it much, but that *should* be true, modulo CRUSH location (a
note on that at the bottom as well).

> That is, if a box goes bad, in theory I should be able to remove the drive
> and jack it into a hot-swap bay on another server and have that server able
> to import the relocated OSD.

I've effectively done a chassis swap, moving all the drives including the boot
volume, but that admittedly was in the ceph-disk days.

> True, the metadata for an OSD is currently located on its host, but it seems
> like it should be possible to carry a copy on the actual device.

My limited understanding is that *is* the case with LVM (see the lvs sketch at
the bottom).

>
> Tim
>
> On 4/11/25 16:23, Anthony D'Atri wrote:
>> Filestore, pre-ceph-volume, may have been entirely different. IIRC LVM is
>> used these days to exploit persistent metadata tags.
>>
>>> On Apr 11, 2025, at 4:03 PM, Tim Holloway <t...@mousetech.com> wrote:
>>>
>>> I just checked an OSD and the "block" entry is indeed linked to storage
>>> using a /dev/mapper UUID LV, not a /dev/device. When ceph builds an
>>> LV-based OSD, it creates a VG named "ceph-uuuu", where "uuuu" is a UUID,
>>> and an LV named "osd-block-vvvv", where "vvvv" is also a UUID. So although
>>> you'd map the OSD to something like /dev/vdb in a VM, the actual name ceph
>>> uses is UUID-based (and LVM-based) and thus not subject to change with
>>> alterations in the hardware, as the UUIDs are part of the metadata in the
>>> VGs and LVs created by ceph.
>>>
>>> Since I got that from a VM, I can't vouch for all cases, but I thought it
>>> especially interesting that ceph was creating LVM counterparts even for
>>> devices that were not themselves LVM-based.
>>>
>>> And yeah, I understand that it's the amount of OSD replica data that counts
>>> more than the number of hosts, but when an entire host goes down and there
>>> are few hosts, that can take a large bite out of the replicas.
>>>
>>> Tim
>>>
>>> On 4/11/25 10:36, Anthony D'Atri wrote:
>>>> I thought those links were to the by-uuid paths for that reason?
>>>>
>>>>> On Apr 11, 2025, at 6:39 AM, Janne Johansson <icepic...@gmail.com> wrote:
>>>>>
>>>>> On Fri, 11 Apr 2025 at 09:59, Anthony D'Atri
>>>>> <anthony.da...@gmail.com> wrote:
>>>>>> Filestore IIRC used partitions, with cute hex GPT types for various
>>>>>> states and roles. Udev activation was sometimes problematic, and LVM
>>>>>> tags are more flexible and reliable than the prior approach. There is
>>>>>> no doubt more to it, but that's what I recall.
>>>>> Filestore used to have softlinks towards the journal device (if used)
>>>>> which pointed to sdX, where that X of course would jump around if you
>>>>> changed the number of drives on the box, or the kernel disk detection
>>>>> order changed, breaking the OSD.
>>>>>
>>>>> --
>>>>> May the most significant bit of your life be positive.
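
A few command sketches for the points flagged above. These are off the top of
my head and untested as written; device names, OSD ids, weights, and hostnames
are placeholders, so adjust before running anything.

For the SATA downshift question, assuming smartmontools is installed and
/dev/sdb stands in for the suspect drive:

    # Negotiated vs. maximum link speed; a 6 Gb/s drive currently running at
    # 3.0 or 1.5 Gb/s is often a sign the link renegotiated down after errors.
    smartctl -x /dev/sdb | grep -i 'sata version'

    # SATA Phy event counters; climbing CRC / R_ERR counts tend to implicate
    # cabling, backplane, or HBA rather than the drive media itself.
    smartctl -l sataphy /dev/sdb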
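
On the ceph-volume side, roughly what I meant, run on the host that now holds
the relocated disk (the OSD id and fsid in the last line are made up; take the
real ones from the list output):

    # Show which OSDs ceph-volume can discover from the LVM tags on attached disks
    ceph-volume lvm list

    # Activate every OSD it finds
    ceph-volume lvm activate --all

    # Or activate a single OSD by id and fsid as reported by "lvm list"
    ceph-volume lvm activate 12 01234567-89ab-cdef-0123-456789abcdef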
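
On the CRUSH location caveat: when a portable OSD starts up under a different
hostname, it will by default (osd_crush_update_on_start) re-home itself under
the new host bucket, which means data movement. A sketch for checking and, if
you prefer, placing it deliberately; osd.12, the weight, and "newhost" are
invented:

    # Where does the cluster currently think the OSD lives?
    ceph osd tree

    # OSDs normally update their own CRUSH location at startup
    ceph config get osd osd_crush_update_on_start

    # Place it explicitly if you'd rather control the move yourself
    ceph osd crush create-or-move osd.12 1.819 host=newhost root=default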
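
And on metadata travelling with the device: my understanding (I may have the
exact tag names slightly wrong) is that ceph-volume records what it needs as
LVM tags on the OSD's LV, so the information rides along with the physical
disk rather than living only on the original host:

    # Dump the tags ceph-volume stored on each LV; expect entries along the
    # lines of ceph.osd_id, ceph.osd_fsid, ceph.cluster_fsid, ceph.type
    lvs -o lv_name,vg_name,lv_tags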