Hi Paul,

have I understood your idea correctly - you're trying to attach new, empty DB volumes to the existing OSDs in an attempt to recover them, right? And the original SSD drives which held the DB volumes are physically dead?

If so, then IMO it's a road to nowhere: the "recovered" OSDs won't run without their original metadata. This is rather a waste of time.


Thanks,

Igor

On 23.04.2025 0:11, Paul Browne wrote:
Hi ceph-users,

We recently suffered a total power failure at our main DC; fortunately, our 
production Ceph cluster emerged unscathed but a smaller Ceph cluster came back 
with the majority of its dedicated SSD BlueStore block.DB devices dead (but its 
HDD OSD devices unharmed). This cluster underpinned a small OpenStack cloud, so 
it would be preferable to recover it rather than writing it off.

In terms of deployment tooling, this ailing Ceph cluster is a fairly standard Red 
Hat Ceph Storage 7 (so Reef) cephadm-deployed cluster, with the main wrinkle 
being that both the DB and HDD OSD devices use the cephadm-supported 
LVM->LUKS layering above the BlueStore devices.

The dead BlueStore block.DB devices are of course blocking the surviving HDD 
OSD daemons (in cephadm-deployed containers) from coming up cleanly, so the 
Ceph cluster status is currently very degraded (status attached for the ugly 
picture).

I've kicked around some ideas for recovering the dead DB devices and restarting 
the down+out OSDs by:

  * Manually partitioning replacement SSDs into new DB device partitions+LVs
  * Installing on them the same LUKS keys retrieved from the Ceph config DB, 
matching up against which OSD is on which OSD host
  * Manually changing over the OSDs' device links to their new DB devices with 
"ceph-bluestore-tool bluefs-bdev-new-db" or similar
  * Trying to restart the OSDs with updated links to the new 
LVM->LUKS->block.DB devices

This approach seems highly messy and depends on extracting a lot of 
information error-free from dumps of "ceph-volume lvm list" in order to exactly 
match extant OSD UUIDs to newly created DB devicemapper devices.
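For what it's worth, the UUID-matching step can at least be scripted rather than 
done by eye. Below is a minimal sketch, assuming JSON in the shape emitted by 
"ceph-volume lvm list --format json" (top-level keys are OSD ids, each mapping to 
a list of LV entries tagged with their role); the device paths and fsids in the 
sample are made up for illustration:

```python
import json

# Hypothetical sample in the shape of "ceph-volume lvm list --format json";
# the LV paths and osd_fsid values here are invented for the example.
sample = json.loads("""
{
  "0": [
    {"type": "block",
     "lv_path": "/dev/ceph-aaa/osd-block-1111",
     "tags": {"ceph.osd_fsid": "1111", "ceph.type": "block"}},
    {"type": "db",
     "lv_path": "/dev/ceph-bbb/osd-db-1111",
     "tags": {"ceph.osd_fsid": "1111", "ceph.type": "db"}}
  ]
}
""")

def osd_device_map(lvm_list):
    """Map OSD id -> {role: (osd_fsid, lv_path)} so that the block and db
    LVs belonging to one OSD can be cross-checked via their shared osd_fsid
    before any links are changed over."""
    out = {}
    for osd_id, lvs in lvm_list.items():
        for lv in lvs:
            role = lv["tags"].get("ceph.type", lv.get("type"))
            out.setdefault(osd_id, {})[role] = (
                lv["tags"]["ceph.osd_fsid"], lv["lv_path"])
    return out

mapping = osd_device_map(sample)

# Sanity check: every LV attributed to one OSD must carry the same osd_fsid,
# otherwise the block/db pairing is wrong and the OSD must not be touched.
for osd_id, roles in mapping.items():
    fsids = {fsid for fsid, _ in roles.values()}
    assert len(fsids) == 1, f"osd.{osd_id}: mismatched osd_fsid across LVs"
```

A check like this would run before the "bluefs-bdev-new-db" step, refusing to 
proceed for any OSD whose block and candidate DB LVs don't agree on osd_fsid.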

Is there going to be some smarter/better/faster way to non-destructively 
recover these intact HDD OSDs which have links to dead block.DB devices, using 
native cephadm tooling rather than getting so low-level as all the above?

Many thanks for any advice,

*******************
Paul Browne
Research Computing Platforms
University Information Services
Roger Needham Building
JJ Thompson Avenue
University of Cambridge
Cambridge
United Kingdom
E-Mail: pf...@cam.ac.uk
Tel: 0044-1223-746548
*******************
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io