Hi Wido,

Thank you for the response:

> On Aug 17, 2016, at 16:25, Wido den Hollander <w...@42on.com> wrote:
> 
> 
>> Op 17 augustus 2016 om 17:44 schreef Dan Jakubiec <dan.jakub...@gmail.com>:
>> 
>> 
>> Hello, we have a Ceph cluster with 8 OSD that recently lost power to all 8 
>> machines.  We've managed to recover the XFS filesystems on 7 of the 
>> machines, but the OSD service is only starting on 1 of them.
>> 
>> The other 5 machines all have complaints similar to the following:
>> 
>>      2016-08-17 09:32:15.549588 7fa2f4666800 -1 
>> filestore(/var/lib/ceph/osd/ceph-1) Error initializing leveldb : Corruption: 
>> 6 missing files; e.g.: /var/lib/ceph/osd/ceph-1/current/omap/042421.ldb
>> 
>> How can we repair the leveldb to allow the OSDs to startup?  
>> 
> 
> My first question would be: How did this happen?
> 
> What hardware are you using underneath? Is there a RAID controller which is 
> not flushing properly? Since this should not happen during a power failure.
> 

Each OSD drive is connected to an onboard hardware RAID controller and 
configured in RAID 0 mode as individual virtual disks.  The RAID controller is 
an LSI 3108.

I agree -- I am finding it bizarre that 7 of our 8 OSDs (one per machine) did 
not survive the power outage.  

We did have some problems with the stock Ubuntu xfs_repair (3.1.9) 
segfaulting, which we eventually overcame by building a newer version of 
xfs_repair (4.7.0).  With that version the repair did finally complete cleanly.

We actually have some different errors on other OSDs.  A few of them are 
failing with "Missing map in load_pgs" errors.  But generally speaking it 
appears to be missing files of various types causing different kinds of 
failures.
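
For triage, here is a rough sketch of how the OSD logs could be tallied by 
failure type.  It only matches the two messages quoted in this thread, the 
category names are made up for illustration, and it assumes each log message 
sits on a single line (the excerpt above is wrapped by the mail client):

```python
import re
from collections import Counter

# Patterns built from the two error messages quoted in this thread.
# The category names are hypothetical labels, not Ceph terminology.
PATTERNS = {
    "leveldb_missing_files": re.compile(
        r"Error initializing leveldb\s*:\s*Corruption: \d+ missing files"
    ),
    "missing_map_load_pgs": re.compile(r"Missing map in load_pgs"),
}


def classify_line(line):
    """Return the category of a log line, or None if it matches nothing."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def tally(lines):
    """Count how many lines fall into each known failure category."""
    counts = Counter()
    for line in lines:
        category = classify_line(line)
        if category is not None:
            counts[category] += 1
    return counts
```

Feeding it the lines of each OSD's log (for example via open(path) on 
whichever log file applies) would give a per-OSD count of each failure type.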

I'm really nervous now about the OSD's inability to start with any 
inconsistencies and no repair utilities (that I can find).  Any advice on how 
to recover?

> I don't know the answer to your question, but lost files are not good.
> 
> You might find them in a lost+found directory if XFS repair worked?
> 

Sadly this directory is empty.

-- Dan

> Wido
> 
>> Thanks,
>> 
>> -- Dan J
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com