Personally, I would suggest the following:
- change the replication failure domain (CRUSH chooseleaf type) from the default, host, to OSD - see the command sketch below
- remove the OSDs from the host with all those down OSDs (note that they are down, not out, which makes it even stranger)
- let the single-node cluster stabilise; yes, performance will suck, but at least you will have the data in two copies on a single host... better that than nothing
- fix whatever issues you have on host OSD2
- add all the OSDs on OSD2 back in, and mark all the OSDs from OSD1 with weight 0 - this will make Ceph migrate all data away from host OSD1
- fix all the problems you've got on host OSD1
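To make the CRUSH change and the later drain of OSD1 concrete, here is a rough, untested sketch of the commands involved; the crush map file names are arbitrary, the osd ids are taken from your tree quoted below, and everything should be sanity-checked against your own setup before running it:

  # change the CRUSH failure domain from host to osd
  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt
  # edit crush.txt: in the replicated ruleset, change
  #   step chooseleaf firstn 0 type host
  # to
  #   step chooseleaf firstn 0 type osd
  crushtool -c crush.txt -o crush.new.bin
  ceph osd setcrushmap -i crush.new.bin

  # remove the down OSDs on host OSD2 (osd.3, osd.4, osd.5, osd.8 in the tree below)
  ceph osd crush remove osd.3
  ceph auth del osd.3
  ceph osd rm 3
  # repeat for osd.4, osd.5 and osd.8

  # later, once OSD2 is fixed and its OSDs are back in, drain host OSD1
  ceph osd crush reweight osd.0 0
  ceph osd crush reweight osd.1 0
  ceph osd crush reweight osd.2 0
  ceph osd crush reweight osd.6 0
  ceph osd crush reweight osd.7 0

With the failure domain at osd (and size=2), both replicas of a PG are allowed to land on host OSD1 while OSD2 is being repaired, which is what makes the single-host phase survivable.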
The reason I suggest this is that it seems you've got issues everywhere, and since you are running a production environment (at least it seems like that to me), data and downtime are the main priority.

> On 28 Aug 2017, at 11:58, Ronny Aasen <[email protected]> wrote:
>
> On 28. aug. 2017 08:01, hjcho616 wrote:
>> Hello!
>> I've been using ceph for a long time, mostly for network CephFS storage, even before the Argonaut release! It's been working very well for me. Yes, I had some power outages before and asked a few questions on this list before and got them resolved happily! Thank you all!
>> Not sure why, but we've been having quite a bit of power outages lately. Ceph appeared to be running OK with those going on... so I was pretty happy and didn't think much of it... till yesterday, when I started to move some videos to CephFS and ceph decided that it was full, although df showed only 54% utilization! Then I looked it up: some of the OSDs were down! (only 3 at that point!)
>> I am running a pretty simple ceph configuration... I have one machine running MDS and mon, named MDS1, and two OSD machines with 5 2TB HDDs and 1 SSD for journal, named OSD1 and OSD2.
>> At the time, I was running jewel 10.2.2. I looked at some of the downed OSDs' log files and googled some of the errors... they appeared to be tied to version 10.2.2, so I upgraded everything to 10.2.9. Well, that didn't solve my problems... =P While I was looking at some of this, there was another power outage! D'oh! I may need to invest in a UPS or something... Until this happened, all of the down OSDs were from OSD2, but this time OSD1 took a hit! It couldn't boot, because osd-0 was damaged... I tried xfs_repair -L /dev/sdb1 as suggested by the command line and was able to mount it again, phew, reboot... then /dev/sdb1 is no longer accessible! Noooo!!!
>> So this is what I have today! I am a bit concerned as half of the OSDs are down! and osd.0 doesn't look good at all...
>> # ceph osd tree
>> ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 16.24478 root default
>> -2  8.12239     host OSD1
>>  1  1.95250         osd.1       up  1.00000          1.00000
>>  0  1.95250         osd.0     down        0          1.00000
>>  7  0.31239         osd.7       up  1.00000          1.00000
>>  6  1.95250         osd.6       up  1.00000          1.00000
>>  2  1.95250         osd.2       up  1.00000          1.00000
>> -3  8.12239     host OSD2
>>  3  1.95250         osd.3     down        0          1.00000
>>  4  1.95250         osd.4     down        0          1.00000
>>  5  1.95250         osd.5     down        0          1.00000
>>  8  1.95250         osd.8     down        0          1.00000
>>  9  0.31239         osd.9       up  1.00000          1.00000
>> This looked a lot better before that last extra power outage... =( Can't mount it anymore!
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159 pgs stuck unclean; 102 pgs stuck undersized; 102 pgs undersized; 1 requests are blocked > 32 sec; recovery 1803466/4503980 objects degraded (40.042%); recovery 692976/4503980 objects misplaced (15.386%); recovery 147/2251990 unfound (0.007%); 1 near full osd(s); 54 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
>> Each of the OSDs is showing a different failure signature.
>> I've uploaded the OSD logs with debug osd = 20, debug filestore = 20, and debug ms = 20. You can find them in the links below. Let me know if there is a preferred way to share this!
>> https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc (ceph-osd.3.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc (ceph-osd.4.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k (ceph-osd.5.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE (ceph-osd.8.log)
>> So how does this look? Can this be fixed? =) If so, please let me know. I used to take backups, but since it grew so big I wasn't able to do so anymore... and I would like to get most of this back if I can. Please let me know if you need more info!
>> Thank you!
>> Regards,
>> Hong
>
> With only 2 OSD hosts, how are you doing replication? I assume you use size=2, and that is somewhat OK if you have min_size=2, but if you have min_size=1 it can quickly become a big problem of lost objects.
>
> With size=2, min_size=2 your data should be safely on 2 drives (if you can get one of them running again), but your cluster will block when there is an issue.
>
> If at all possible I would add a third OSD node to your cluster, so your OK PGs can replicate to it and you can work on the down OSDs without fear of losing additional working OSDs.
>
> Also, some of your logs contain lines like...
>
> failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.3.asok': (17) File exists
>
> filestore(/var/lib/ceph/osd/ceph-3) lock_fsid failed to lock /var/lib/ceph/osd/ceph-3/fsid, is another ceph-osd still running? (11) Resource temporarily unavailable
>
> 7faf16e23800 -1 osd.3 0 OSD::pre_init: object store '/var/lib/ceph/osd/ceph-3' is currently in use. (Is ceph-osd already running?)
>
> 7faf16e23800 -1 ** ERROR: osd pre_init failed: (16) Device or resource busy
>
> This can indicate that you have a dead osd.3 process keeping the resources open and preventing a new osd from starting.
>
> Check with ps aux whether you can see any ceph processes. If you find something relating to your down OSDs, try stopping it normally, and if that fails, kill it manually before trying to restart the osd.
>
> Also check dmesg for messages relating to faulty hardware or the OOM killer. I have had experiences with the OOM killer where the osd node became unreliable until I rebooted the machine.
>
> Kind regards, and good luck
> Ronny Aasen

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
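For the stale-process checks Ronny describes above, something along these lines is what is meant; this is a rough sketch only, assuming osd.3 as the example, a systemd-based install (adjust the service handling if you are on sysvinit), and with <pid> standing in for whatever ps actually reports:

  # confirm what size/min_size the pools really have
  ceph osd dump | grep 'replicated size'

  # look for a leftover ceph-osd process still holding the fsid lock / admin socket
  ps aux | grep ceph-osd

  # try a clean stop first; only kill the old pid if the stop does nothing
  systemctl stop ceph-osd@3
  kill <pid>

  # then start the osd again and watch its log
  systemctl start ceph-osd@3

  # look for OOM-killer activity or disk errors
  dmesg | grep -iE 'oom|error'

If dmesg shows the OOM killer was involved, rebooting the OSD host before restarting the daemons is the safer route, as Ronny notes.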
