> The failed SSD disks seem to be quite dead unfortunately, not visible to
> the OS and also marked as dead in the node iDRAC BMC.
I’ve found that iDRAC’s view of SSDs is sometimes … imperfect, but "not
visible to the OS" is telling. If you could send me the output of

  storcli64 /c0 show termlog > /var/tmp/termlog.txt   # or perccli64
  storcli64 /c0 show all

I’d love to take a look and see whether the HBA has any additional
information. One possible, though unlikely, scenario is that the lost drives
had a firmware flaw while the surviving drives had a newer revision.

> We haven't tried moving them to a different node to test though, I can try
> that.
>
> In this power event we lost all of the SSD devices on 2 out of 3 OSD nodes
> in the cluster (it was a small testing cluster), and half of them on the
> 3rd OSD node.
>
> So the vast majority of OSDs can't start here and the overall cluster
> state is extremely degraded.
>
> So if there is state contained within the old, dead DB devices that can't
> be directly replaced by instantiating new replacement DB devices, then it
> looks like we've lost too many DB devices in one fell swoop to ever
> recover this Ceph cluster, despite the OSD HDDs all being clean and
> untouched by the power event.
>
> I had been hoping that the DB state was more ephemeral than it seems to
> be, and that instantiating new DB devices mapped to the correct OSD
> devices (via LUKS key) would allow the down+out OSDs to be restarted. But
> from the updates on this thread, that increasingly looks not to be
> possible.
>
> *******************
> Paul Browne
> Research Computing Platforms
> University Information Services
> Roger Needham Building
> JJ Thompson Avenue
> University of Cambridge
> Cambridge
> United Kingdom
> E-Mail: pf...@cam.ac.uk
> Tel: 0044-1223-746548
> *******************
> ________________________________
> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
> Sent: 23 April 2025 11:24
> To: Paul Browne <pf...@cam.ac.uk>
> Cc: ceph-users <ceph-users@ceph.io>
> Subject: Re: [ceph-users] Cluster recovery: DC power failure killed OSD
> node BlueStore block.DB devices
>
> Hi Paul,
>
> Could you provide more details about the 'SSD BlueStore block.DB devices
> dead' issue?
>
> Are these devices not seen, or seen as defective, at the hardware level
> (through iLO, iDRAC, etc.)? Or are they visible to the operating system
> but their associated OSDs are failing to start?
>
> If you can't bring these RocksDB devices back online, the associated OSDs
> will be permanently dead.
>
> Regards,
> Frédéric.
>
> ----- On 22 Apr 25, at 23:11, Paul Browne pf...@cam.ac.uk wrote:
>
>> Hi ceph-users,
>>
>> We recently suffered a total power failure at our main DC; fortunately,
>> our production Ceph cluster emerged unscathed, but a smaller Ceph cluster
>> came back with the majority of its dedicated SSD BlueStore block.DB
>> devices dead (though its HDD OSD devices were unharmed). This cluster
>> underpinned a small OpenStack cloud, so it would be preferable to recover
>> it rather than write it off.
>>
>> In terms of deployment tooling, this ailing Ceph cluster is a fairly
>> standard Red Hat Ceph Storage 7 (so Quincy) cephadm-deployed cluster,
>> with the main wrinkle being that both the DB and HDD OSD devices use the
>> cephadm-supported LVM->LUKS layering above the BlueStore devices.
>>
>> The dead BlueStore block.DB devices are of course blocking the surviving
>> HDD OSD daemons (in cephadm-deployed containers) from coming up cleanly,
>> so the Ceph cluster status is currently very degraded (status attached
>> for the ugly picture).
>>
>> I've kicked around some ideas for recovering the dead DB devices and
>> restarting the down+out OSDs by:
>>
>> * Manually partitioning replacement SSDs into new DB device partitions+LVs
>> * Installing the same LUKS keys on them, retrieved from the Ceph config
>>   DB, matched against which OSD is on which OSD host
>> * Manually changing over device links for OSDs to their DB device with
>>   "ceph-bluestore-tool bluefs-bdev-new-db" or similar
>> * Trying to restart OSDs with updated links to the new
>>   LVM->LUKS->block.DB devices
>>
>> This approach seems highly messy and would require extracting a lot of
>> information, error-free, from dumps of "ceph-volume lvm list" in order to
>> exactly match extant OSD UUIDs to newly created DB devicemapper devices.
>>
>> Is there going to be some smarter/better/faster way to non-destructively
>> recover these intact HDD OSDs, which have links to dead block.DB devices,
>> using native cephadm tooling rather than getting so low-level as all the
>> above?
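[Editor's aside: the error-prone matching step described above need not be
done by eye; `ceph-volume lvm list --format json` emits machine-readable
output that can be parsed. Below is a minimal, hedged sketch of that idea.
The embedded SAMPLE JSON is hypothetical and heavily trimmed -- a real dump
carries many more LV tags and entries -- but the tag names shown
(`ceph.osd_fsid`, `ceph.db_device`) follow what ceph-volume records on its
LVs.]

```python
import json

# Hypothetical, heavily trimmed stand-in for the output of
# `ceph-volume lvm list --format json`; real output has one list entry
# per LV (block, db, ...) and far more tags per entry.
SAMPLE = """
{
  "7": [
    {
      "type": "block",
      "lv_path": "/dev/ceph-block-7/block-7",
      "tags": {
        "ceph.osd_id": "7",
        "ceph.osd_fsid": "0df2d04e-aaaa-bbbb-cccc-0123456789ab",
        "ceph.db_device": "/dev/ceph-db-old/db-7"
      }
    }
  ]
}
"""

def osd_db_map(listing):
    """Return {osd_id: (osd_fsid, recorded db_device)} for each block LV."""
    out = {}
    for osd_id, lvs in listing.items():
        for lv in lvs:
            if lv.get("type") != "block":
                continue  # skip db/wal entries; we want the data LV's tags
            tags = lv.get("tags", {})
            out[osd_id] = (tags.get("ceph.osd_fsid"),
                           tags.get("ceph.db_device"))
    return out

if __name__ == "__main__":
    for osd_id, (fsid, db) in sorted(osd_db_map(json.loads(SAMPLE)).items()):
        print(f"osd.{osd_id}  fsid={fsid}  old-db={db}")
```

The resulting osd_id -> fsid table can then drive the later steps (keying the
new LUKS volumes and pointing each OSD at its replacement DB LV) instead of
transcribing UUIDs by hand.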
>>
>> Many thanks for any advice,
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io