Thank you all for the suggestions!
Maged,
I'll see what I can do on that... Looks like I may have to add another OSD host,
as I've used up all of the SATA ports on those boards. =P
Ronny,
I am running with size=2 min_size=1. I created everything with ceph-deploy and
didn't touch many of the pool settings... I hope not, but it sounds like I may
have lost some files! I do want some of those OSDs to come back online
somehow... to get that confidence level up. =P
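For reference, this is roughly how I checked (cephfs_data is just an example
pool name here, I looked at each of my pools):
# ceph osd dump | grep 'replicated size'
# ceph osd pool get cephfs_data size
# ceph osd pool get cephfs_data min_size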
The dead osd.3 message is probably from me trying to stop and start the OSD.
There were some cases where stop didn't kill the ceph-osd process, so I just
started or restarted the OSD to see if that worked... There have been some
reboots since then, and I am not seeing those messages anymore...
Tomasz,
This is something I am running at home. I am the only user. In a way it is a
production environment, but it's just driven by me. =)
Do you have any suggestions for getting osd.3, osd.4, osd.5, and osd.8 to come
back up without removing them? I have a feeling I can get some data back if
some of them are intact.
Thank you!
Regards,
Hong
On Monday, August 28, 2017 6:09 AM, Tomasz Kusmierz
<[email protected]> wrote:
Personally I would suggest to:
- change the replication failure domain to OSD (from the default of host) -
see the example commands below
- remove the down OSDs from the host that has all of them (note that they are
down, not out, which makes it even weirder)
- let the single-node cluster stabilise; yes, performance will suck, but at
least you will have two copies of your data on a single host ... better that
than nothing
- fix whatever issues you have on host OSD2
- add all the OSDs on OSD2 back in and mark all the OSDs from OSD1 with
weight 0 - this will make ceph migrate all the data away from host OSD1
- fix all the problems you've got on host OSD1
The reason I suggest this is that it seems you've got issues everywhere, and
since you are running a production environment (at least it seems like that to
me), data and downtime are the main priority.
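Something along these lines should do it (the rule and pool names here are
just examples, use whatever you actually have; on jewel the pool setting is
called crush_ruleset, newer releases call it crush_rule):
# ceph osd crush rule create-simple replicated-osd default osd
# ceph osd crush rule dump replicated-osd
# ceph osd pool set cephfs_data crush_ruleset <ruleset id from the dump>
# ceph osd pool set cephfs_metadata crush_ruleset <ruleset id from the dump>
Later, when you want to drain host OSD1:
# ceph osd crush reweight osd.0 0
# ceph osd crush reweight osd.1 0
... and so on for the rest of the OSDs on OSD1.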
> On 28 Aug 2017, at 11:58, Ronny Aasen <[email protected]> wrote:
>
> On 28. aug. 2017 08:01, hjcho616 wrote:
>> Hello!
>> I've been using ceph for a long time, mostly for network CephFS storage, even
>> before the Argonaut release! It's been working very well for me. Yes, I had
>> some power outages before and asked a few questions on this list, and they
>> got resolved happily! Thank you all!
>> Not sure why, but we've been having quite a few power outages lately.
>> Ceph appeared to be running OK with those going on... so I was pretty happy and
>> didn't think much of it... till yesterday. When I started to move some
>> videos to cephfs, ceph decided that it was full although df showed only 54%
>> utilization! Then I looked, and some of the osds were down! (only 3 at that
>> point!)
>> I am running a pretty simple ceph configuration... I have one machine running
>> the MDS and mon, named MDS1, and two OSD machines, named OSD1 and OSD2, each
>> with five 2TB HDDs and one SSD for the journal.
>> At the time, I was running jewel 10.2.2. I looked at some of the downed OSDs'
>> log files and googled some of the errors... they appeared to be tied to version
>> 10.2.2, so I just upgraded everything to 10.2.9. Well, that didn't solve my
>> problems... =P While looking at some of this, there was another power
>> outage! D'oh! I may need to invest in a UPS or something... Until this
>> happened, all of the down osds were on OSD2, but this time OSD1 took a hit!
>> It couldn't boot because osd.0 was damaged... I tried xfs_repair -L /dev/sdb1
>> as suggested on the command line... I was able to mount it again, phew,
>> reboot... then /dev/sdb1 was no longer accessible! Noooo!!!
>> So this is what I have today! I am a bit concerned, as half of the osds are
>> down, and osd.0 doesn't look good at all...
>> # ceph osd tree
>> ID WEIGHT   TYPE NAME     UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 16.24478 root default
>> -2  8.12239     host OSD1
>>  1  1.95250         osd.1      up  1.00000          1.00000
>>  0  1.95250         osd.0    down        0          1.00000
>>  7  0.31239         osd.7      up  1.00000          1.00000
>>  6  1.95250         osd.6      up  1.00000          1.00000
>>  2  1.95250         osd.2      up  1.00000          1.00000
>> -3  8.12239     host OSD2
>>  3  1.95250         osd.3    down        0          1.00000
>>  4  1.95250         osd.4    down        0          1.00000
>>  5  1.95250         osd.5    down        0          1.00000
>>  8  1.95250         osd.8    down        0          1.00000
>>  9  0.31239         osd.9      up  1.00000          1.00000
>> This looked a lot better before that last extra power outage... =( Can't
>> mount it anymore!
>> # ceph health
>> HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 44 pgs
>> backfill_toofull; 80 pgs backfill_wait; 122 pgs degraded; 6 pgs down; 8 pgs
>> inconsistent; 6 pgs peering; 2 pgs recovering; 18 pgs recovery_wait; 16 pgs
>> stale; 122 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 159
>> pgs stuck unclean; 102 pgs stuck undersized; 102 pgs undersized; 1 requests
>> are blocked > 32 sec; recovery 1803466/4503980 objects degraded (40.042%);
>> recovery 692976/4503980 objects misplaced (15.386%); recovery 147/2251990
>> unfound (0.007%); 1 near full osd(s); 54 scrub errors; mds cluster is
>> degraded; no legacy OSD present but 'sortbitwise' flag is not set
>> Each of the osds is showing a different failure signature.
>> I've uploaded the osd logs with debug osd = 20, debug filestore = 20, and debug
>> ms = 20. You can find them at the links below. Let me know if there is a
>> preferred way to share these!
>> https://drive.google.com/open?id=0By7YztAJNGUWQXItNzVMR281Snc
>> (ceph-osd.3.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWYmJBb3RvLVdSQWc
>> (ceph-osd.4.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWaXhRMlFOajN6M1k
>> (ceph-osd.5.log)
>> https://drive.google.com/open?id=0By7YztAJNGUWdm9BWFM5a3ExOFE
>> (ceph-osd.8.log)
>> So how does this look? Can this be fixed? =) If so, please let me know. I
>> used to take backups, but since it grew so big I wasn't able to do so
>> anymore... and I would like to get most of this back if I can. Please let me
>> know if you need more info!
>> Thank you!
>> Regards,
>> Hong
>
> With only 2 osd hosts, how are you doing replication? I assume you use
> size=2, and that is somewhat OK if you have min_size=2, but if you have
> min_size=1 it can quickly become a big problem of lost objects.
>
> With size=2, min_size=2 your data should be safely on 2 drives (if you can get
> one of them running again), but your cluster will block when there is an
> issue.
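>
> Once things are healthy again I would also bump min_size back to 2, something
> like this (pool names are just examples, use your own):
>
> # ceph osd pool set cephfs_data min_size 2
> # ceph osd pool set cephfs_metadata min_size 2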
>
> If at all possible I would add a third osd node to your cluster, so your OK
> PGs can replicate to it and you can work on the down osds without fear of
> losing additional working osds.
>
> Also, some of your logs contain lines like...
>
> failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.3.asok':
> (17) File exists
>
> filestore(/var/lib/ceph/osd/ceph-3) lock_fsid failed to lock
> /var/lib/ceph/osd/ceph-3/fsid, is another ceph-osd still running? (11)
> Resource temporarily unavailable
>
> 7faf16e23800 -1 osd.3 0 OSD::pre_init: object store
> '/var/lib/ceph/osd/ceph-3' is currently in use. (Is ceph-osd already running?)
>
> 7faf16e23800 -1 ** ERROR: osd pre_init failed: (16) Device or resource busy
>
>
>
> This can indicate that you have a dead osd.3 process keeping the resources
> open and preventing a new osd from starting.
>
> Check with ps aux whether you can see any ceph processes. If you do find
> something relating to your down osds, you should try stopping it normally,
> and if that fails, kill it manually before trying to restart the osd.
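>
> For example, something like this (assuming a systemd based install; repeat
> for each down osd id):
>
> # ps aux | grep ceph-osd
> # systemctl stop ceph-osd@3
> # kill -9 <pid>     (only if the old process is still hanging around)
> # systemctl start ceph-osd@3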
>
> Also check dmesg for messages relating to faulty hardware or the OOM killer.
> I have had experiences with the OOM killer where the osd node became
> unreliable until I rebooted the machine.
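>
> Something along these lines will usually show it (just an example, grep for
> whatever your disks/controller log as):
>
> # dmesg | grep -i 'out of memory'
> # dmesg | grep -iE 'i/o error|xfs|ata'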
>
>
> kind regards, and good luck
> Ronny Aasen
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com