> I have seen drives work absolutely fine past their "100% used" indicator in
> SMART, but on a power cycle they flat out refuse to enumerate on the bus.
> The team that ran into this was lucky enough to catch it on one machine first
> so they could grab the data before rebooting the other hosts. This was also
> many years ago so I hope their firmware does something different now.

In recent years the trend is to NOT hard-disable at the end of rated PE cycles, but this is still on a SKU-by-SKU basis.

> What does the health look like on the remaining drives? How long were the
> dead ones in service?

That SKU is from 2018, but was rated at 10 DWPD, so I suspect it's not a lifetime issue as such; perhaps firmware.
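For the "health on the remaining drives" question, the wear and error counters are quick to pull from the OS. A minimal sketch; device names are illustrative and smartmontools / nvme-cli are assumed to be installed:

    smartctl -a /dev/sda        # SATA/SAS SSD: wear indicator, error counters, firmware revision
    nvme smart-log /dev/nvme0   # NVMe: percentage_used, media errors, unsafe shutdowns
    nvme list                   # model and firmware revision of every NVMe drive in one table

Comparing firmware revisions between the dead and surviving drives is also worth doing, given the firmware suspicion.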

> -paul
>
> --
>
> Paul Mezzanini
> Platform Engineer III
> Research Computing
> Rochester Institute of Technology
>
>
> ________________________________________
> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
> Sent: Wednesday, April 23, 2025 10:09 AM
> To: Paul Browne
> Cc: ceph-users; Anthony D'Atri
> Subject: [ceph-users] Re: Cluster recovery: DC power failure killed OSD node BlueStore block.DB devices
>
> Hi Paul,
>
> I'm continuing this thread here, following Anthony's insightful remarks and
> offer to help.
>
> I find it hard to believe that enterprise-grade NVMe drives would fail during
> a power outage, unless there's an issue with the NVMe or HBA firmware. I
> recommend opening a support case with Dell, HPE, or whichever manufacturer
> made your server.
>
> Before doing that, try these troubleshooting steps:
>
> - Shut down the server completely
> - Disconnect all power cables for at least 10 minutes
> - Restart the server (this might resolve temporary discovery issues during boot)
>
> If the drives reappear during startup, you may need to 'import' them during
> the boot process. Watch for a message on the console prompting you to do this.
> If these steps don't help, try upgrading all firmware on the server.
>
> I've seen 'dead' Dell SSDs (Toshiba) come back to life after a firmware
> upgrade, even when marked as dead in iDRAC. See [1] for details.
>
> Ultimately, your best course of action is to open a support case with your
> hardware manufacturer.
>
> Regards,
> Frédéric.
>
> [1] https://www.spinics.net/lists/ceph-users/msg78647.html
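A side note on Frédéric's 'import' suggestion above: if the controller flags the returning drives as a foreign configuration after the power drain, the import can usually also be done from the OS rather than at the boot prompt. A sketch only, assuming a Dell PERC on controller 0 (standard storcli/perccli syntax):

    perccli64 /c0/fall show      # list any foreign configuration the controller sees
    perccli64 /c0/fall import    # import it so the drives/VDs reappear

That only helps if the HBA still sees the drives at all, of course.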

> ----- On 23 Apr 25, at 15:37, Anthony D'Atri anthony.da...@gmail.com wrote:
>
>>> The failed SSD disks seem to be quite dead unfortunately, not visible to the OS
>>> and also marked as dead in the node iDRAC BMC.
>>
>> I’ve found that iDRAC’s view of SSDs is sometimes … imperfect, but not visible
>> to the OS is telling.
>>
>> If you could send me
>>
>> storcli64 /c0 show termlog >/var/tmp/termlog.txt # or perccli64
>> storcli64 /c0 show all
>>
>> I’d love to take a look and see if the HBA has any additional information.
>>
>> One possible though unlikely scenario is that the lost drives had a firmware
>> flaw but surviving drives had a newer revision.
>>
>>
>>> We haven't tried moving them to a different node to test though, I can try that.
>>>
>>> In this power event we lost all of the SSD devices on 2 out of 3 OSD nodes in
>>> the cluster (it was a small testing cluster) and half of them on the 3rd OSD node.
>>>
>>> So the vast majority of OSDs can't start here and the overall cluster state is
>>> extremely degraded.
>>>
>>> So if there is state contained within the old, dead DB devices that can't be
>>> directly replaced with the instantiation of new replacement DB devices, then
>>> it's looking like we've just lost too many DB devices in one fell swoop to ever
>>> recover this Ceph cluster, despite the OSD HDDs all being clean+untouched by
>>> the power event.
>>>
>>> I had been hoping that the DB state was more ephemeral than it seems to be, and
>>> so instantiation of new DB devices mapped to the correct OSD devices (via LUKS
>>> key) would allow for restarting the down+out OSD devices. But that's
>>> increasingly looking not to be possible, from updates on this thread.
>>>
>>> *******************
>>> Paul Browne
>>> Research Computing Platforms
>>> University Information Services
>>> Roger Needham Building
>>> JJ Thompson Avenue
>>> University of Cambridge
>>> Cambridge
>>> United Kingdom
>>> E-Mail: pf...@cam.ac.uk
>>> Tel: 0044-1223-746548
>>> *******************
>>> ________________________________
>>> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
>>> Sent: 23 April 2025 11:24
>>> To: Paul Browne <pf...@cam.ac.uk>
>>> Cc: ceph-users <ceph-users@ceph.io>
>>> Subject: Re: [ceph-users] Cluster recovery: DC power failure killed OSD node BlueStore block.DB devices
>>>
>>> Hi Paul,
>>>
>>> Could you provide more details about the 'SSD BlueStore block.DB devices dead'
>>> issue?
>>>
>>> Are these devices not seen, or seen as defective, at the hardware level (through
>>> iLO, iDRAC, etc.)? Or are they visible to the operating system but their
>>> associated OSDs are failing to start?
>>> If you can't bring these RocksDB devices back online, the associated OSDs will be
>>> permanently dead.
>>>
>>> Regards,
>>> Frédéric.
>>>
>>> ----- On 22 Apr 25, at 23:11, Paul Browne pf...@cam.ac.uk wrote:
>>>
>>>> Hi ceph-users,
>>>>
>>>> We recently suffered a total power failure at our main DC; fortunately, our
>>>> production Ceph cluster emerged unscathed, but a smaller Ceph cluster came back
>>>> with the majority of its dedicated SSD BlueStore block.DB devices dead (but its
>>>> HDD OSD devices unharmed). This cluster underpinned a small OpenStack cloud, so
>>>> it would be preferable to recover it rather than writing it off.
>>>>
>>>> In terms of deployment tooling, this ailing Ceph cluster is a fairly standard
>>>> Red Hat Ceph Storage 7 (so Quincy) cephadm-deployed cluster, with the main
>>>> wrinkle being that both the DB and HDD OSD devices make use of the
>>>> cephadm-supported LVM->LUKS layering above the BlueStore devices.
>>>>
>>>> The dead BlueStore block.DB devices are of course blocking the surviving HDD OSD
>>>> daemons (in cephadm-deployed containers) from coming up cleanly, and so the Ceph
>>>> cluster status is currently very degraded (attached status for the ugly picture).
>>>>
>>>> I've kicked around some ideas for recovering the dead DB devices and restarting
>>>> down+out OSDs by:
>>>>
>>>> * Manually partitioning replacement SSDs into new DB device partitions+LVs
>>>> * Installing the same LUKS keys on them, retrieved from the Ceph config DB,
>>>>   matching up against which OSD is on which OSD host
>>>> * Manually changing over device links for OSDs to their DB device with
>>>>   "ceph-bluestore-tool bluefs-bdev-new-db" or similar
>>>> * Trying to restart OSDs with updated links to the new LVM->LUKS->block.DB devices
>>>>
>>>> This approach seems highly messy and subject to needing to extract a lot of
>>>> information error-free from dumps of "ceph-volume lvm list" in order to exactly
>>>> match extant OSD UUIDs to newly created DB devicemapper devices.
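For the archives, the manual flow described above would translate roughly to the sketch below for a single OSD. This is an illustration only: the VG/LV names, DB size, and OSD id are made up, and, crucially, bluefs-bdev-new-db only attaches a fresh, empty DB device; it cannot bring back the RocksDB contents that lived on the dead devices.

    # Illustrative names throughout; run on the OSD host with the OSD stopped
    # (under cephadm, inside 'cephadm shell --name osd.12').
    OSD_ID=12
    OSD_FSID=$(ceph osd dump -f json | jq -r ".osds[] | select(.osd==$OSD_ID) | .uuid")

    # 1. Carve a new LV for the replacement DB device
    lvcreate -L 64G -n osd-db-$OSD_ID vg_db_new

    # 2. Re-use the dmcrypt key that ceph-volume stored in the mon config-key store
    ceph config-key get dm-crypt/osd/$OSD_FSID/luks > /tmp/osd-$OSD_ID.key
    cryptsetup luksFormat --key-file /tmp/osd-$OSD_ID.key /dev/vg_db_new/osd-db-$OSD_ID
    cryptsetup open --key-file /tmp/osd-$OSD_ID.key /dev/vg_db_new/osd-db-$OSD_ID ceph-db-$OSD_ID

    # 3. Attach the (empty) DB device and update the block.db symlink
    ceph-bluestore-tool bluefs-bdev-new-db \
        --path /var/lib/ceph/osd/ceph-$OSD_ID \
        --dev-target /dev/mapper/ceph-db-$OSD_ID

    # ceph-volume activation also keys off LVM tags (ceph.db_device, ceph.db_uuid),
    # which would need fixing up by hand -- part of what makes this so messy.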
>>>> Is there going to be some smarter/better/faster way to non-destructively recover
>>>> these intact HDD OSDs which have links to dead block.DB devices, using native
>>>> cephadm tooling rather than getting as low-level as all of the above?
>>>>
>>>> Many thanks for any advice,
>>>>
>>>> *******************
>>>> Paul Browne
>>>> Research Computing Platforms
>>>> University Information Services
>>>> Roger Needham Building
>>>> JJ Thompson Avenue
>>>> University of Cambridge
>>>> Cambridge
>>>> United Kingdom
>>>> E-Mail: pf...@cam.ac.uk
>>>> Tel: 0044-1223-746548
>>>> *******************
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io