Maged, on the second host he has 4 out of 5 OSDs failed on him … I think he's past the point of trying to increase the backfill threshold :) Of course he could try to degrade the cluster by letting it mirror within the same host :)

> On 29 Aug 2017, at 21:26, Maged Mokhtar <mmokh...@petasan.org> wrote:
>
> One of the things to watch out for in small clusters is that OSDs can get full rather unexpectedly in recovery/backfill cases:
>
> In your case you have 2 OSD nodes with 5 disks each. Since you have a replica count of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs will have to be re-created on the same host, meaning they will be distributed only among the 4 remaining OSDs on that host, which will quickly bump their usage by nearly 20% each.
> The default osd_backfill_full_ratio is 85%, so if any of those 4 OSDs was near 70% utilization before the failure, it will easily reach 85% and cause the cluster to error with the backfill_toofull message you see. This is why I suggest you add an extra disk, or try your luck raising osd_backfill_full_ratio to 92%; it may fix things.
>
> /Maged
>
> On 2017-08-29 21:13, hjcho616 wrote:
>
>> Nice! Thank you for the explanation! I feel like I can revive that OSD. =) That does sound great. I don't quite have another cluster, so I'm waiting for a drive to arrive! =)
>>
>> After setting size and min_size to 1, it looks like the toofull flag is gone... Maybe when I was making that video copy the OSDs were already down... and those two OSDs were not enough to take that much extra... and on top of it, that last OSD alive was the smaller disk (2TB vs 320GB)... so it probably was filling up faster. I should have captured that message... but I turned the machine off and now I am at work. =P When I get back home, I'll try to grab it and share. Maybe I don't need to try to add another OSD to that cluster just yet! OSDs are about 50% full on OSD1.
>>
>> So next up, fixing osd0!
>>
>> Regards,
>> Hong
>>
>>
>> On Tuesday, August 29, 2017 1:05 PM, David Turner <drakonst...@gmail.com> wrote:
>>
>> But it was absolutely awesome to run an OSD off of an RBD after the disk failed.
>>
>> On Tue, Aug 29, 2017, 1:42 PM David Turner <drakonst...@gmail.com> wrote:
>> To add to Steve's success: the RBD was created in a second cluster in the same datacenter, so it didn't run the risk of deadlocking that mapping RBDs on machines running OSDs has. It should theoretically work on the same cluster, but it is inherently more dangerous for a few reasons.
>>
>> On Tue, Aug 29, 2017, 1:15 PM Steve Taylor <steve.tay...@storagecraft.com> wrote:
>> Hong,
>>
>> Probably your best chance at recovering any data without special, expensive forensic procedures is to perform a dd from /dev/sdb to somewhere else large enough to hold a full disk image and attempt to repair that. You'll want to use 'conv=noerror' with your dd command since your disk is failing. Then you could either re-attach the OSD from the new source or attempt to retrieve objects from the filestore on it.
>>
>> I have actually done this before by creating an RBD that matches the disk size, performing the dd, running xfs_repair, and eventually adding it back to the cluster as an OSD. Using RBDs as OSDs is certainly a temporary arrangement for repair only, but I'm happy to report that it worked flawlessly in my case.
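(A minimal sketch of the rescue path Steve describes, assuming the dying OSD's filestore is XFS on /dev/sdb1; the pool/image names, size and OSD id below are placeholders, and per David's caveat above the RBD ideally lives in a second cluster:)

  # rbd create rescue/sdb-image --size 2T --image-feature layering    (size >= the source disk; plain 'layering' so krbd can map it)
  # rbd map rescue/sdb-image                                          (shows up as e.g. /dev/rbd0)
  # dd if=/dev/sdb1 of=/dev/rbd0 bs=1M conv=noerror,sync              ('noerror,sync' skips unreadable sectors but keeps offsets aligned)
  # xfs_repair /dev/rbd0
  # mount /dev/rbd0 /var/lib/ceph/osd/ceph-2                          (mount in place of the dead OSD's data dir)
  # systemctl start ceph-osd@2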
>> I was able to weight the OSD to 0, offload all of its data, then remove it for a full recovery, at which point I just deleted the RBD.
>>
>> The possibilities afforded by Ceph inception are endless. ☺
>>
>>
>> Steve Taylor | Senior Software Engineer | StorageCraft Technology Corporation
>>
>>
>> On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
>> > Rule of thumb with batteries is:
>> > - the more you run them at a "proper temperature", the more life you get out of them
>> > - the more a battery is over-specced for your application, the longer it will survive.
>> >
>> > Get yourself an LSI 94** controller and use it as an HBA and you will be fine. But get MORE DRIVES !!!!! ...
>> > > On 28 Aug 2017, at 23:10, hjcho616 <hjcho...@yahoo.com> wrote:
>> > >
>> > > Thank you Tomasz and Ronny. I'll have to order some HDDs soon and try these out. The car battery idea is nice! I may try that.. =) Do they last longer? The ones that fit the UPS's original battery spec didn't last very long... part of the reason why I gave up on them.. =P My wife probably won't like the idea of a car battery hanging around though, ha!
>> > >
>> > > The OSD1 (the one with mostly OK OSDs, except for that SMART failure) motherboard doesn't have any additional SATA connectors available. Would it be safe to add another OSD host?
>> > >
>> > > Regards,
>> > > Hong
>> > >
>> > >
>> > > On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz <tom.kusmierz@gmail.com> wrote:
>> > >
>> > > Sorry for being brutal ... anyway
>> > > 1. get the battery for the UPS (a car battery will do as well; I've modded a UPS in the past with a truck battery and it was working like a charm :D )
>> > > 2. get spare drives and put those in, because your cluster CAN NOT get out of error due to lack of space
>> > > 3. follow Ronny Aasen's advice on how to recover data from the hard drives
>> > > 4. get cooling for the drives or you will lose more!
>> > >
>> > >
>> > > > On 28 Aug 2017, at 22:39, hjcho616 <hjcho...@yahoo.com> wrote:
>> > > >
>> > > > Tomasz,
>> > > >
>> > > > Those machines are behind a surge protector. Doesn't appear to be a good one! I do have a UPS... but it is my fault... no battery. Power was pretty reliable for a while... and the UPS was just beeping every chance it had, disrupting some sleep.. =P So it's running on the surge protector only. I am running this in a home environment, and so far HDD failures have been very rare in this environment. =) It just doesn't get loaded as much! I am not sure what to expect; seeing that "unfound" message, and just the feeling that I might get the OSD back, made me excited about it. =) Thanks for letting me know what the priority should be. I just lack experience and knowledge in this. =) Please do continue to guide me through this.
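(Picking up Steve's follow-up at the top of this message — weight the OSD to 0, let backfill drain it, then remove it — a rough Jewel-era command sequence would be the following; the OSD id 2 is just a placeholder:)

  # ceph osd crush reweight osd.2 0      (CRUSH weight 0 makes backfill move its data elsewhere)
  ... wait until the affected PGs are active+clean again ...
  # ceph osd out 2
  # systemctl stop ceph-osd@2
  # ceph osd crush remove osd.2
  # ceph auth del osd.2
  # ceph osd rm 2

After that the temporary RBD can be unmapped and deleted (rbd unmap, rbd rm).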
>> > > > Thank you for the decode of those SMART messages! I do agree that it looks like it is on its way out. I would like to know how to get a good portion of it back if possible. =)
>> > > >
>> > > > I think I just set the size and min_size to 1.
>> > > > # ceph osd lspools
>> > > > 0 data,1 metadata,2 rbd,
>> > > > # ceph osd pool set rbd size 1
>> > > > set pool 2 size to 1
>> > > > # ceph osd pool set rbd min_size 1
>> > > > set pool 2 min_size to 1
>> > > >
>> > > > It seems to be doing some backfilling work.
>> > > >
>> > > > # ceph health
>> > > > HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
>> > > >
>> > > > Regards,
>> > > > Hong
>> > > >
>> > > >
>> > > > On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz <tom.kusmierz@gmail.com> wrote:
>> > > >
>> > > > So, to decode a few things about your disk:
>> > > >
>> > > > 1 Raw_Read_Error_Rate 0x002f 100 100 051 Pre-fail Always - 37
>> > > > 37 read errors and only one sector marked as pending - fun disk :/
>> > > >
>> > > > 181 Program_Fail_Cnt_Total 0x0022 099 099 000 Old_age Always - 35325174
>> > > > So the firmware has quite a few bugs, that's nice.
>> > > >
>> > > > 191 G-Sense_Error_Rate 0x0022 100 100 000 Old_age Always - 2855
>> > > > The disk was thrown around while operational - even more nice.
>> > > >
>> > > > 194 Temperature_Celsius 0x0002 047 041 000 Old_age Always - 53 (Min/Max 15/59)
>> > > > If your disk passes 50 you should not consider using it; high temperatures demagnetise the platter layer and you will see more errors in the very near future.
>> > > >
>> > > > 197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 1
>> > > > As mentioned before :)
>> > > >
>> > > > 200 Multi_Zone_Error_Rate 0x002a 100 100 000 Old_age Always - 4222
>> > > > Your heads keep missing tracks ... bent? I don't even know how to comment here.
>> > > >
>> > > > Generally a fun drive you've got there ... rescue as much as you can and throw it away !!!
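(The attribute dump Tomasz is decoding above is the kind of output smartmontools produces; assuming smartmontools is installed and the disk is /dev/sdb, it can be re-checked with:)

  # smartctl -a /dev/sdb             (full SMART attribute and error-log dump)
  # smartctl -t long /dev/sdb        (start an extended offline self-test)
  # smartctl -l selftest /dev/sdb    (read the self-test results once it finishes)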
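(And circling back to Maged's suggestion at the top of the thread: on a pre-Luminous cluster such as Jewel, checking per-OSD utilization and bumping the backfill threshold to 92% could look roughly like this — a stop-gap only, until real capacity is added:)

  # ceph osd df                                                   (per-OSD utilization; watch the %USE column)
  # ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'   (runtime change on all OSDs)

To make it persistent, set osd_backfill_full_ratio = 0.92 under [osd] in ceph.conf; Luminous and later replaced this option with the mon-managed 'ceph osd set-backfillfull-ratio'.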
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com