Hello,

On Wed, 17 Aug 2016 16:54:41 -0500 Dan Jakubiec wrote:
> Hi Wido,
> 
> Thank you for the response:
> 
> > On Aug 17, 2016, at 16:25, Wido den Hollander <w...@42on.com> wrote:
> > 
> >> On 17 August 2016 at 17:44, Dan Jakubiec <dan.jakub...@gmail.com> wrote:
> >> 
> >> Hello, we have a Ceph cluster with 8 OSDs that recently lost power to
> >> all 8 machines. We've managed to recover the XFS filesystems on 7 of
> >> the machines, but the OSD service is only starting on 1 of them.
> >> 
> >> The other 5 machines all have complaints similar to the following:
> >> 
> >> 2016-08-17 09:32:15.549588 7fa2f4666800 -1
> >> filestore(/var/lib/ceph/osd/ceph-1) Error initializing leveldb :
> >> Corruption: 6 missing files; e.g.:
> >> /var/lib/ceph/osd/ceph-1/current/omap/042421.ldb
> >> 
That looks bad. And as Wido said, this shouldn't happen.

What are your XFS mount options for that FS? I tend to remember seeing
"nobarrier" in many OSD examples... (a quick way to check is sketched
further down in this mail).

> >> How can we repair the leveldb to allow the OSDs to start up?
> >> 
Hopefully somebody with a leveldb clue will pipe up, but I have grave
doubts. For what it's worth, a brute-force salvage attempt is sketched at
the bottom of this mail.

> > My first question would be: How did this happen?
> > 
> > What hardware are you using underneath? Is there a RAID controller
> > which is not flushing properly? This should not happen during a power
> > failure.
> > 
> Each OSD drive is connected to an onboard hardware RAID controller and
> configured in RAID 0 mode as individual virtual disks. The RAID
> controller is an LSI 3108.
> 
What are the configuration options (see further down for a way to dump
them)? If there is no BBU and the controller is forcibly set to writeback
caching, this would explain it, too.

> I agree -- I am finding it bizarre that 7 of our 8 OSDs (one per
> machine) did not survive the power outage.
> 
My philosophy on this is that if any of the DCs we're in should suffer a
total and abrupt power loss I won't care, as I'll be buried below tons of
concrete (this being Tokyo). In a place where power outages are more
likely, I'd put a local UPS in front of things and issue a remote shutdown
from it when it starts to run out of juice. Having a HW/SW combo that can
survive a sudden power loss is nice; having something in place that softly
shuts things down before that point is a lot better.

> We did have some problems with the stock Ubuntu xfs_repair (3.1.9) seg
> faulting, which we eventually overcame by building a newer version of
> xfs_repair (4.7.0). But it did finally repair clean.
> 
That also doesn't instill me with confidence, both Ubuntu- and XFS-wise.

> We actually have some different errors on other OSDs. A few of them are
> failing with "Missing map in load_pgs" errors. But generally speaking it
> appears to be missing files of various types causing different kinds of
> failures.
> 
> I'm really nervous now about the OSDs' inability to start with any
> inconsistencies, and the lack of repair utilities (that I can find). Any
> advice on how to recover?
> 
What I've seen in the past assumes that you have at least a running
cluster of sorts, just trashed PGs. This is far worse.

Christian

> > I don't know the answer to your question, but lost files are not good.
> > 
> > You might find them in a lost+found directory if XFS repair worked?
> > 
> Sadly this directory is empty.
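
P.S.: A few sketches that might help with the digging. All of them are
untested starting points, not gospel.

First, to see whether any of the OSD filesystems were mounted with
"nobarrier" (plain Python, nothing Ceph-specific, run on each OSD node):

    #!/usr/bin/env python
    # List every mounted XFS filesystem and flag the ones mounted with
    # "nobarrier" (no write barriers = on-disk state may be stale after
    # a power loss).
    with open('/proc/mounts') as mounts:
        for line in mounts:
            device, mountpoint, fstype, options = line.split()[:4]
            if fstype != 'xfs':
                continue
            flag = '  <-- nobarrier!' if 'nobarrier' in options.split(',') else ''
            print('%s on %s: %s%s' % (device, mountpoint, options, flag))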
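
Second, for the LSI 3108, dumping the controller and per-VD cache settings
should tell you quickly whether you were running WriteBack without a
(healthy) BBU/CacheVault. I'm writing the storcli invocations below from
memory, so treat the exact binary name and "/c0" controller path as
assumptions to adjust for your boxes:

    #!/usr/bin/env python
    # Dump LSI/Avago controller and virtual drive settings via storcli.
    # ASSUMPTIONS: storcli64 is in $PATH and the controller is /c0.
    # Look for the cache policy ("WB"/"WriteBack") in the VD output and
    # the BBU/CacheVault state in the controller output.
    import subprocess

    for cmd in (['storcli64', '/c0', 'show', 'all'],
                ['storcli64', '/c0/vall', 'show', 'all']):
        print('### ' + ' '.join(cmd))
        try:
            print(subprocess.check_output(cmd).decode())
        except (OSError, subprocess.CalledProcessError) as err:
            print('could not run %s: %s' % (cmd[0], err))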
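
And for the leveldb/omap corruption itself: leveldb ships a RepairDB
routine that salvages whatever tables it can still read. It will NOT bring
back the 6 missing .ldb files, so expect to lose omap data, and I wouldn't
trust such an OSD any further than needed to get the cluster limping
again. A minimal sketch, assuming the plyvel Python bindings (which use
the system leveldb, hopefully close enough to what your Ceph build links
against) and that the OSD in question is stopped:

    #!/usr/bin/env python
    # Attempt a leveldb salvage on one OSD's omap store.
    # This rewrites the store in place - keep the backup copy it makes.
    import shutil
    import plyvel

    OMAP = '/var/lib/ceph/osd/ceph-1/current/omap'   # adjust per OSD

    # Keep a copy of the (already broken) store, just in case.
    shutil.copytree(OMAP, OMAP + '.bak')

    # Let leveldb rebuild what it can from the surviving .ldb/.log files.
    plyvel.repair_db(OMAP)

    # Sanity check: the store should at least open and iterate now.
    db = plyvel.DB(OMAP)
    for i, (key, value) in enumerate(db.iterator()):
        if i >= 10:
            break
        print(key)
    db.close()

Even if that gets ceph-osd to start, I'd treat that OSD as suspect and
rebuild/backfill it as soon as the cluster lets you.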
> 
> -- Dan
> 
> > Wido
> > 
> >> Thanks,
> >> 
> >> -- Dan J

-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com