Re: [ceph-users] Power outages!!! help!

hjcho616 Mon, 28 Aug 2017 15:11:03 -0700

Thank you Tomasz and Ronny.  I'll have to order some HDDs soon and try these 
out.  The car battery idea is nice!  I may try that.. =)  Do they last longer?  
The ones that fit the UPS original battery spec didn't last very long... part 
of the reason why I gave up on them.. =P  My wife probably won't like the idea 
of a car battery hanging out though ha!
The OSD1 host (the one with mostly OK OSDs, except for that SMART failure) has 
a motherboard with no additional SATA connectors available.  Would it be safe 
to add another OSD host?
Regards,
Hong


    On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz 
<[email protected]> wrote:
 

 Sorry for being brutal … anyway:
1. Get the battery for the UPS (a car battery will do as well; I’ve modded one 
   UPS in the past with a truck battery and it was working like a charm :D).
2. Get spare drives and put those in, because your cluster CAN NOT get out of 
   error due to lack of space.
3. Follow the advice of Ronny Aasen on how to recover data from hard drives.
4. Get cooling to the drives or you will lose more!


On 28 Aug 2017, at 22:39, hjcho616 <[email protected]> wrote:
Tomasz,
Those machines are behind a surge protector.  Doesn't appear to be a good one!  
I do have a UPS... but it is my fault... no battery.  Power was pretty reliable 
for a while... and UPS was just beeping every chance it had, disrupting some 
sleep.. =P  So I am running on a surge protector only.  I am running this in a 
home environment.   So far, HDD failures have been very rare for this 
environment. =)  It just doesn't get loaded as much!  I am not sure what to 
expect; seeing that "unfound" count, and just the feeling that I might get an 
OSD back, made me excited about it. =) Thanks for letting me know what should 
be the priority.  I just lack experience and knowledge in this. =) Please do 
continue to guide me through this. 
Thank you for decoding those SMART messages!  I do agree that it looks like it 
is on its way out.  I would like to know how to get a good portion of it back 
if possible. =)
I think I just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1
Seems to be doing some backfilling work.
# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs 
backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 
6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs 
stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 
pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests 
are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); 
recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 
unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD 
present but 'sortbitwise' flag is not set
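
A one-line health summary like the one above is easier to read when split into its individual items. The following is only a rough Python sketch, assuming the semicolon-separated format shown in this thread (the format differs between Ceph releases); `split_health` is a hypothetical helper name, not part of Ceph, and the sample string is an excerpt of the output above rather than the full line.

```python
import re

# Excerpt of the `ceph health` summary quoted above (not the full line).
HEALTH = (
    "HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; "
    "2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; "
    "108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 50 scrub errors"
)


def split_health(line):
    """Return (status, {pg_state: count}, other_items) for a health summary."""
    status, _, rest = line.strip().partition(" ")
    pg_counts, other = {}, []
    for item in (part.strip() for part in rest.split(";")):
        if not item:
            continue
        # Items of the form "<N> pgs <state>" (optionally "<N> pgs are <state>").
        match = re.fullmatch(r"(\d+) pgs (?:are )?(.+)", item)
        if match:
            pg_counts[match.group(2)] = int(match.group(1))
        else:
            other.append(item)
    return status, pg_counts, other


status, pg_counts, other = split_health(HEALTH)
print(status)
# List PG states from most to least affected placement groups.
for state, count in sorted(pg_counts.items(), key=lambda kv: -kv[1]):
    print(f"{count:5d}  {state}")
for item in other:
    print(f"       {item}")
```

Sorting by count puts the largest groups (here the degraded PGs) first, while items that are not PG counts, such as the scrub errors, are kept separately and printed last.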


Regards,
Hong

    On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz 
<[email protected]> wrote:
 

 So to decode a few things about your disk:

  1 Raw_Read_Error_Rate     0x002f  100  100  051    Pre-fail  Always      -      37
37 read errors and only one sector marked as pending - fun disk :/ 

181 Program_Fail_Cnt_Total  0x0022  099  099  000    Old_age   Always      -      35325174
So the firmware has quite a few bugs, that’s nice.

191 G-Sense_Error_Rate      0x0022  100  100  000    Old_age   Always      -      2855
The disk was thrown around while operational, even nicer.

194 Temperature_Celsius     0x0002  047  041  000    Old_age   Always      -      53 (Min/Max 15/59)
If your disk passes 50 °C you should not consider using it; high temperatures 
demagnetise the platter layer and you will see more errors in the very near 
future.

197 Current_Pending_Sector  0x0032  100  100  000    Old_age   Always      -      1
As mentioned before :)

200 Multi_Zone_Error_Rate   0x002a  100  100  000    Old_age   Always      -      4222
Your heads keep missing tracks … bent? I don’t even know how to comment here.


Generally a fun drive you’ve got there … rescue as much as you can and throw it 
away!!!
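
The checks described above (temperature over 50 and a non-zero pending sector count) can be sketched as a small script. This is only an illustration, assuming the ten-column attribute table layout quoted in this thread; `parse_rows` and `drive_warnings` are hypothetical helper names, and the 50 °C limit is the rule of thumb from this message, not a vendor specification.

```python
# Three of the attribute rows quoted above, in the ten-column table layout.
SMART_ROWS = """\
  1 Raw_Read_Error_Rate     0x002f  100  100  051    Pre-fail  Always      -      37
194 Temperature_Celsius     0x0002  047  041  000    Old_age   Always      -      53 (Min/Max 15/59)
197 Current_Pending_Sector  0x0032  100  100  000    Old_age   Always      -      1
"""


def parse_rows(text):
    """Map attribute name -> leading integer of its raw value."""
    attrs = {}
    for line in text.splitlines():
        # ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW
        parts = line.split(None, 9)
        if len(parts) == 10 and parts[0].isdigit():
            # The raw value may carry a suffix such as "(Min/Max 15/59)".
            attrs[parts[1]] = int(parts[9].split()[0])
    return attrs


def drive_warnings(attrs, max_temp_c=50):
    """Apply the two rules of thumb from this thread (assumed thresholds)."""
    found = []
    temp = attrs.get("Temperature_Celsius", 0)
    if temp > max_temp_c:
        found.append(f"temperature {temp} C is above {max_temp_c} C")
    pending = attrs.get("Current_Pending_Sector", 0)
    if pending > 0:
        found.append(f"{pending} sector(s) pending reallocation")
    return found


attrs = parse_rows(SMART_ROWS)
for warning in drive_warnings(attrs):
    print("WARNING:", warning)
```

For the rows quoted here both rules trigger, which matches the advice to rescue the data and retire the drive.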

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
