# rados list-inconsistent-pg data
["0.0","0.5","0.a","0.e","0.1c","0.29","0.2c"]
# rados list-inconsistent-pg metadata
["1.d","1.3d"]
# rados list-inconsistent-pg rbd
["2.7"]
# rados list-inconsistent-obj 0.0 --format=json-pretty
{ "epoch": 23112, "inconsistents": [] }
# rados list-inconsistent-obj 0.5 --format=json-pretty
{ "epoch": 23078, "inconsistents": [] }
# rados list-inconsistent-obj 0.a --format=json-pretty
{ "epoch": 22954, "inconsistents": [] }
# rados list-inconsistent-obj 0.e --format=json-pretty
{ "epoch": 23068, "inconsistents": [] }
# rados list-inconsistent-obj 0.1c --format=json-pretty
{ "epoch": 22954, "inconsistents": [] }
# rados list-inconsistent-obj 0.29 --format=json-pretty
{ "epoch": 22974, "inconsistents": [] }
# rados list-inconsistent-obj 0.2c --format=json-pretty
{ "epoch": 23194, "inconsistents": [] }
# rados list-inconsistent-obj 1.d --format=json-pretty
{ "epoch": 23072, "inconsistents": [] }
# rados list-inconsistent-obj 1.3d --format=json-pretty
{ "epoch": 23221, "inconsistents": [] }
# rados list-inconsistent-obj 2.7 --format=json-pretty
{ "epoch": 23032, "inconsistents": [] }
Looks like there is not much information there. Could you elaborate on the items you mentioned under "find the object"? How do I check the metadata? What are we looking for in the md5sums?
- find the object :: manually check the objects, check the object metadata, run md5sum on them all and compare. Check the objects on the non-running OSDs and compare there as well; anything to try to determine which object is OK and which is bad.
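For reference, a minimal sketch of what that checking could look like on a filestore OSD (the OSD mount point, PG id, and object-name pattern are only illustrative, borrowed from the scrub log later in the thread):

# on each OSD that holds the inconsistent PG (run as root, ideally with the OSD stopped)
cd /var/lib/ceph/osd/ceph-6/current/0.29_head
find . -name '*200014ce4c3.0000028f*'       # locate the on-disk file for the object
md5sum <file-found-above>                   # compare this checksum across the OSDs
getfattr -d <file-found-above>              # dump the user xattrs (object metadata) for comparison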

I tried the methods from that "Ceph: manually repair object" page on PG 2.7 before. In the 3-replica case it just resulted in a missing shard, regardless of which copy I moved. For the 2-replica case, hmm... I guess I don't know how long "wait a bit" is supposed to be; I turned the OSD back on after a minute or so and it just went back to the same inconsistent message. =P Are we waiting for the stopped OSD to be remapped entirely to a different OSD, so that we have 3 replicas again when the stopped OSD comes back up?

Regards,
Hong

 

On Wednesday, September 20, 2017 4:47 PM, hjcho616 <hjcho...@yahoo.com> wrote:
 

Thanks Ronny.  I'll try that inconsistent issue soon.

I think the OSD drive that PG 1.28 is sitting on is still OK... just some file corruption from the power outage. =P As you suggested:

cd /var/lib/ceph/osd/ceph-4/current/
tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz 1.28_*
cd /var/lib/ceph/osd/ceph-10/tmposd
mkdir current
chown ceph.ceph current/
cd current/
tar --xattrs --preserve-permissions -zxvf /var/lib/ceph/osd/ceph-4/current/osd.4.tar.gz
systemctl start ceph-osd@8

I created a temp OSD like I did during the import, then set its crush reweight to 0. I noticed the current directory was missing =P, so I created a current directory and copied the content there.

Starting the OSD doesn't appear to show any activity. Are there any other files I need to copy over besides the 1.28_head and 1.28_tail directories?
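One thing I can check to see what the started OSD is actually doing (osd.8 and the default log path are just the ones from the commands above; the pg query is a standard command):

tail -f /var/log/ceph/ceph-osd.8.log     # watch the injected OSD boot, scan, and peer
ceph pg 1.28 query                       # see whether the PG now reports the temp OSD as a peer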
Regards,
Hong

On Wednesday, September 20, 2017 4:04 PM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
 

I would only tar the PG you have missing objects from; trying to inject older objects when the PG is already correct cannot be good.

Scrub errors are kind of the issue with only 2 replicas: when you have 2 different objects, how do you know which one is correct and which one is bad?

As you have read on http://ceph.com/geen-categorie/ceph-manually-repair-object/ and on http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ you need to:

- find the pg      :: rados list-inconsistent-pg [pool]
- find the problem :: rados list-inconsistent-obj 0.6 --format=json-pretty ; gives you the object name. Look for hints as to what the bad object is.
- find the object  :: manually check the objects, check the object metadata, run md5sum on them all and compare. Check the objects on the non-running OSDs and compare there as well; anything to try to determine which object is OK and which is bad.
- fix the problem  :: assuming you find the bad object: stop the affected OSD holding the bad object, remove the object manually, restart the OSD, and issue the repair command (see the sketch below).
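A minimal sketch of that last step, in the spirit of the ceph.com post above (the OSD id, PG id, and object file are placeholders, and this assumes filestore OSDs):

systemctl stop ceph-osd@6                        # stop the OSD that holds the bad copy
ceph-osd -i 6 --flush-journal                    # flush its journal before touching the store
cd /var/lib/ceph/osd/ceph-6/current/0.29_head
mv <bad-object-file> /root/bad-object-backup/    # move the bad object out of the way, keep a copy
systemctl start ceph-osd@6
ceph pg repair 0.29                              # repair should now copy back the good replica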
 
 
If the rados commands do not give you the info, you need to do it all manually as on http://ceph.com/geen-categorie/ceph-manually-repair-object/
 
Good luck
Ronny Aasen
 
 On 20.09.2017 22:17, hjcho616 wrote:
  
  Thanks Ronny. 
I decided to try to tar everything under the current directory. Is this the correct command for it? Are there any directories we do not want on the new drive (commit_op_seq, meta, nosnap, omap)?

tar --xattrs --preserve-permissions -zcvf osd.4.tar.gz .

As for the inconsistent PGs... I am running into the errors below. I tried moving one copy of a PG to another location, but it just says the moved shard is missing. I tried setting 'noout' and taking one of the OSDs down; that seemed to work on something, but then it went back to the same error. I am currently trying to move it to a different OSD, to make sure the drive is not faulty (I have a few of them), but the errors still persist. I've been kicking off ceph pg repair PG#, hoping it would fix them. =P Any other suggestions?
2017-09-20 09:39:48.481400 7f163c5fa700  0 log_channel(cluster) log [INF] : 0.29 repair starts
2017-09-20 09:47:37.384921 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97126ead:::200014ce4c3.0000028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.0000028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od ffffffff alloc_hint [0 0])
2017-09-20 09:47:37.384931 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97126ead:::200014ce4c3.0000028f:head data_digest 0x8f679a50 != data_digest 0x979f2ed4 from auth oi 0:97126ead:::200014ce4c3.0000028f:head(19366'539375 client.535319.1:2361163 dirty|data_digest|omap_digest s 4194304 uv 539375 dd 979f2ed4 od ffffffff alloc_hint [0 0])
2017-09-20 09:47:37.384936 7f163c5fa700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97126ead:::200014ce4c3.0000028f:head: failed to pick suitable auth object
2017-09-20 09:48:11.138566 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 6: soid 0:97d5c15a:::100000101b4.00006892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::100000101b4.00006892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od ffffffff alloc_hint [0 0])
2017-09-20 09:48:11.138575 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 shard 7: soid 0:97d5c15a:::100000101b4.00006892:head data_digest 0xd65b4014 != data_digest 0xf41cfab8 from auth oi 0:97d5c15a:::100000101b4.00006892:head(12962'65557 osd.4.0:42234 dirty|data_digest|omap_digest s 4194304 uv 776 dd f41cfab8 od ffffffff alloc_hint [0 0])
2017-09-20 09:48:11.138581 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 soid 0:97d5c15a:::100000101b4.00006892:head: failed to pick suitable auth object
2017-09-20 09:48:55.584022 7f1639df5700 -1 log_channel(cluster) log [ERR] : 0.29 repair 4 errors, 0 fixed
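Both shards report the same data_digest (0x8f679a50), which differs from the digest recorded in the object info (0x979f2ed4), so repair cannot pick an authoritative copy on its own. To find which OSDs currently hold PG 0.29 for manual comparison (standard commands; the PG id is from the log above):

ceph pg map 0.29       # show the up/acting OSD set for the PG
ceph pg 0.29 query     # detailed peering and scrub state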
Latest health:

HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs down; 1 pgs incomplete; 9 pgs inconsistent; 1 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 68 scrub errors; mds rank 0 has failed; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set

Regards,
Hong
  
  
 
On Wednesday, September 20, 2017 11:53 AM, Ronny Aasen <ronny+ceph-us...@aasen.cx> wrote:
  
 
    On 20.09.2017 16:49, hjcho616 wrote:
  
Anyone? Can this PG be saved? If not, what are my options?

Regards,
Hong
 
On Saturday, September 16, 2017 1:55 AM, hjcho616 <hjcho...@yahoo.com> wrote:
  
 
Looking better... working on scrubbing:

HEALTH_ERR 1 pgs are stuck inactive for more than 300 seconds; 1 pgs incomplete; 12 pgs inconsistent; 2 pgs repair; 1 pgs stuck inactive; 1 pgs stuck unclean; 109 scrub errors; too few PGs per OSD (29 < min 30); mds rank 0 has failed; mds cluster is degraded; noout flag(s) set; no legacy OSD present but 'sortbitwise' flag is not set

Now on to PG 1.28. Looking at all the old OSDs, dead or alive, the only one with a DIR_* directory for it is osd.4. This appears to be the metadata pool! 21M of metadata can be quite a bit of stuff, so I would like to rescue this! But I am not able to start this OSD, and exporting through ceph-objectstore-tool appears to crash, even with --skip-journal-replay and --skip-mount-omap (different failure). As I mentioned in an earlier email, that exception-thrown message is bogus...

# ceph-objectstore-tool --op export --pgid 1.28 --data-path /var/lib/ceph/osd/ceph-4 --journal-path /var/lib/ceph/osd/ceph-4/journal --file ~/1.28.export
terminate called after throwing an instance of 'std::domain_error'
  
         
[SNIP]

What can I do to save that PG 1.28? Please let me know if you need more information. So close!... =)

Regards,
Hong
12 inconsistent PGs and 109 scrub errors are something you should fix first of all.

You can also consider using the paid services of the many Ceph support companies that specialize in this kind of situation.

That being said, here are some suggestions...

When it comes to lost-object recovery you have come about as far as I have ever experienced, so everything after here is just assumptions and wild guesswork about what you can try. I hope others shout out if I tell you wildly wrong things.

If you have found data for pg 1.28 on the broken OSD, and have checked all the other working and non-working drives for that PG, then you need to try to extract the PG from the broken drive. As always in recovery cases, take a dd clone of the drive and work from the cloned image, to avoid more damage to the drive and to allow you to try multiple times.

You should add a temporary injection drive large enough for that PG, and set its crush weight to 0 so it always drains. Make sure it is up and registered properly in Ceph.

The idea is to copy the PG manually from the broken OSD to the injection drive, since export/import fails, making sure you get all xattrs included. One can either copy the whole PG or just the "missing" objects; if there are few objects I would go for that, if there are many I would take the whole PG. You won't get data from leveldb, so I am not at all sure this will work, but it is worth a shot.
- stop your injection OSD, verify it is down and the process is not running.
- from the mountpoint of your broken OSD, go into the current directory and tar up the pg 1.28 directories; make sure you use -p and --xattrs when you create the archive.
- if tar errors out on unreadable files, just rm those (since you are working on a copy of your rescue image, you can always try again).
- copy the tar file to the injection drive and extract it while sitting in its current directory (remember --xattrs).
- set debug options for the injection drive in ceph.conf.
- start the injection drive and follow along in the log file. Hopefully it should scan, locate the PG, and replicate the pg 1.28 objects off to the current primary drive for pg 1.28; since it has crush weight 0 it should drain out.
- if that works, verify the injection drive is drained, stop it, remove it from Ceph, and zap the drive. (A rough sketch of the tar/extract and debug settings follows below.)
  
This is all, as I said, guesstimates, so your mileage may vary.

Good luck,
Ronny Aasen
  _______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


   

   