Apparently those UUIDs aren't as reliable as I thought.
I've had problems with a server box that hosts a ceph VM. Looks like the
mobo disk controller is unreliable AND one of the disks passes SMART but
has interface problems. So I moved the disks to an alternate box.
Between relocation and dropping the one disk, neither of the 2 OSDs for
that host will come up. If everything was running solely on static
UUIDs, the good disk should have been findable even if its physical disk
device name shifted. But it wasn't.
Which brings up something I've wondered about for some time. Shouldn't
it be possible for OSDs to be portable? That is, if a box goes bad, in
theory I should be able to remove the drive and jack it into a hot-swap
bay on another server and have that server able to import the relocated OSD.
True, the metadata for an OSD is currently located on its host, but it
seems like it should be possible to carry a copy on the actual device.
Tim
On 4/11/25 16:23, Anthony D'Atri wrote:
Filestore, pre-ceph-volume, may have been entirely different. IIRC LVM is used
these days to exploit persistent metadata tags.
On Apr 11, 2025, at 4:03 PM, Tim Holloway <t...@mousetech.com> wrote:
I just checked an OSD and the "block" entry is indeed linked to storage using a /dev/mapper UUID-named LV, not a /dev/device.
When ceph builds an LV-based OSD, it creates a VG named "ceph-uuuu", where "uuuu" is a UUID, and an
LV named "osd-block-vvvv", where "vvvv" is also a UUID. So although you'd map the OSD to something like
/dev/vdb in a VM, the name ceph actually uses is UUID-based (and LVM-based). It is thus not subject to change when the
hardware is altered, as the UUIDs are part of the metadata in the VGs and LVs created by ceph.
Since I got that from a VM, I can't vouch for all cases, but I thought it
especially interesting that ceph was creating LVM counterparts even for
devices that were not themselves LVM-based.
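For what it's worth, the naming scheme described above can be sketched in a few lines of Python. This is only an illustration, not anything ceph itself ships: the function names are made up, and the tag keys "ceph.osd_id" and "ceph.osd_fsid" are an assumption on my part about what ceph-volume stores in the LV tags (they are not stated in this thread). The idea is that an OSD can be recognized from the VG/LV names and tags alone, without caring which /dev/sdX or /dev/vdX the disk landed on.

```python
import re

# Canonical lowercase UUID, as used in the "ceph-<uuid>" / "osd-block-<uuid>" names.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
VG_RE = re.compile(rf"^ceph-({UUID})$")
LV_RE = re.compile(rf"^osd-block-({UUID})$")


def parse_lv_tags(tag_field):
    """Turn a comma-separated 'key=value' LVM tag string into a dict."""
    tags = {}
    for item in tag_field.split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip tags that are not key=value pairs
            tags[key.strip()] = value.strip()
    return tags


def describe_lv(vg_name, lv_name, tag_field=""):
    """Return OSD identity info if the VG/LV names follow the ceph pattern,
    otherwise None. The tag keys read here are assumed, not confirmed."""
    vg = VG_RE.match(vg_name)
    lv = LV_RE.match(lv_name)
    if not (vg and lv):
        return None
    tags = parse_lv_tags(tag_field)
    return {
        "vg_uuid": vg.group(1),
        "lv_uuid": lv.group(1),
        "osd_id": tags.get("ceph.osd_id"),      # assumed tag key
        "osd_fsid": tags.get("ceph.osd_fsid"),  # assumed tag key
    }
```

You would feed it the name and tag columns from something like "lvs --noheadings -o vg_name,lv_name,lv_tags"; a non-ceph LV such as vg0/root simply comes back as None.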
And yeah, I understand that it's the amount of replicated OSD data that counts
more than the number of hosts, but when an entire host goes down and there are
few hosts, that can take a large bite out of the replicas.
Tim
On 4/11/25 10:36, Anthony D'Atri wrote:
I thought those links were to the by-uuid paths for that reason?
On Apr 11, 2025, at 6:39 AM, Janne Johansson <icepic...@gmail.com> wrote:
Den fre 11 apr. 2025 kl 09:59 skrev Anthony D'Atri <anthony.da...@gmail.com>:
Filestore IIRC used partitions, with cute hex GPT types for various states and
roles. Udev activation was sometimes problematic, and LVM tags are more
flexible and reliable than the prior approach. There is no doubt more to it,
but that’s what I recall.
Filestore used to have softlinks towards the journal device (if used)
which pointed to sdX where that X of course would jump around if you
changed the number of drives on the box, or the kernel disk detection
order changed, breaking the OSD.
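To illustrate the failure mode above, here is a minimal Python sketch that checks whether a symlink points straight at a kernel-assigned name (sdX and friends) rather than at a persistent path. The function names and the set of device-name patterns are my own choices for the example, not anything from Filestore or ceph-volume.

```python
import os
import re

# Kernel-assigned block device names (with optional partition suffix).
# These can change when drives are added or removed, or when detection order shifts.
KERNEL_NAME_RE = re.compile(r"^/dev/(sd[a-z]+|vd[a-z]+|nvme\d+n\d+)(p?\d+)?$")


def is_kernel_name(path):
    """True for paths like /dev/sdb1 or /dev/nvme0n1p2, False for /dev/disk/by-* paths."""
    return bool(KERNEL_NAME_RE.match(path))


def link_is_fragile(link_path):
    """True if the symlink's immediate target is a kernel-assigned device name.

    os.readlink is used on purpose instead of os.path.realpath: a persistent
    /dev/disk/by-partuuid/... path is itself a symlink that resolves to sdX,
    so fully resolving it would hide the distinction we care about.
    """
    target = os.readlink(link_path)
    if not os.path.isabs(target):
        # Relative targets are interpreted relative to the link's directory.
        target = os.path.normpath(os.path.join(os.path.dirname(link_path), target))
    return is_kernel_name(target)
```

A journal link pointing at /dev/sdb1 would be flagged, while one pointing at a /dev/disk/by-partuuid/ entry would not.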
--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io