Thank you Ronny. I've added two OSDs to OSD2, 2TB each. I hope that will be
enough. =) I've changed min_size and size to 2. The OSDs are busy rebalancing
again. I'll try the steps you recommended and will get back to you with more
questions! =)
# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 19.87198 root default
-2  8.12239     host OSD1
 1  1.95250         osd.1       up  1.00000          1.00000
 0  1.95250         osd.0     down        0          1.00000
 7  0.31239         osd.7       up  1.00000          1.00000
 6  1.95250         osd.6       up  1.00000          1.00000
 2  1.95250         osd.2       up  1.00000          1.00000
-3 11.74959     host OSD2
 3  1.95250         osd.3     down        0          1.00000
 4  1.95250         osd.4     down        0          1.00000
 5  1.95250         osd.5     down        0          1.00000
 8  1.95250         osd.8     down        0          1.00000
 9  0.31239         osd.9       up  1.00000          1.00000
10  1.81360         osd.10      up  1.00000          1.00000
11  1.81360         osd.11      up  1.00000          1.00000
Regards,
Hong
On Sunday, September 3, 2017 6:56 AM, Ronny Aasen
<[email protected]> wrote:
I would not even attempt to connect a recovered drive to ceph, especially not
one that has had xfs errors and corruption.
Your pgs that are undersized lead me to believe you still need to either
expand with more disks or nodes, or that you need to set
osd crush chooseleaf type = 0
to let ceph pick 2 disks on the same node as a valid object placement
(temporary, until you have 2 balanced nodes). Generally, let ceph self-heal as
much as possible (no misplaced or degraded objects); this requires that ceph
has space for the recovery.
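As far as I know that chooseleaf setting only takes effect when a crush map is
generated, so on a running cluster one way to apply it is to edit the crush
map by hand (a sketch only; double-check the decompiled rule before setting):

# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
#   in crush.txt, in the replicated ruleset, change
#   "step chooseleaf firstn 0 type host" to "... type osd"
# crushtool -c crush.txt -o crush-new.bin
# ceph osd setcrushmap -i crush-new.bin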
I would run with size=2 min_size=2.
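That would be, per pool (using the pool names shown earlier in the thread):

# ceph osd pool set rbd size 2
# ceph osd pool set rbd min_size 2

and likewise for the data and metadata pools.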
You should also look at the 7 scrub errors. They indicate that there can be
other drives with issues; you want to locate where those inconsistent objects
are and fix them. Read this page about fixing scrub errors:
http://ceph.com/geen-categorie/ceph-manually-repair-object/
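The rough shape of it is (see the page above for the careful manual
procedure, and verify which replica is the good one before repairing):

# ceph health detail                                  <- lists the inconsistent pgs
# rados list-inconsistent-obj <pgid> --format=json-pretty
# ceph pg repair <pgid>

Be aware that pg repair can copy the primary's copy over the replicas, so do
not run it blindly if the primary might hold the bad copy.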
Then you would be left with the 103 unfound objects, and those you should try
to recover from the recovered drive, using the ceph-objectstore-tool
export/import to export the pgs with missing objects to a dedicated,
temporarily added import drive.
The import drive does not need to be very large, since you can do one pg at a
time, and you should only recover pgs that contain unfound objects. There are
really only 103 unfound objects that you need to recover.
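A sketch of that export/import (run with the osd processes stopped; the pgid
2.5 and the target osd ceph-12 are made-up examples):

ceph-objectstore-tool --op export --pgid 2.5 \
    --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --file /tmp/2.5.export

ceph-objectstore-tool --op import \
    --data-path /var/lib/ceph/osd/ceph-12 \
    --journal-path /var/lib/ceph/osd/ceph-12/journal \
    --file /tmp/2.5.export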
Once the recovery is complete you can wipe the functioning recovery drive and
install it as a new osd in the cluster.
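On jewel that wipe/re-add would be roughly (ceph-disk being the tool of that
era; /dev/sdX is a placeholder for the recovered drive):

# ceph-disk zap /dev/sdX
# ceph-disk prepare /dev/sdX
# ceph-disk activate /dev/sdX1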
kind regards
Ronny Aasen
On 03.09.2017 06:20, hjcho616 wrote:
I checked with ceph-2, 3, 4 and 5, so I figured it was safe to assume that the
superblock file is the same. I copied it over and started the OSD. It still
fails with the same error message. It looks like when I updated to 10.2.9,
some OSDs needed to be upgraded, and that process is not finding the data it
needs? What can I do about this situation?
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
Regards, Hong
On Friday, September 1, 2017 11:10 PM, hjcho616 <[email protected]>
wrote:
Just realized there is a file called superblock in the ceph directory.
ceph-1's and ceph-2's superblock files are identical, and ceph-6's and
ceph-7's are identical, but they differ between the two groups. When I
originally created the OSDs, I created ceph-0 through 5. Can the superblock
file be copied over from ceph-1 to ceph-0?
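(A quick way to compare them, for what it's worth:
# md5sum /var/lib/ceph/osd/ceph-*/superblock )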
Hmm.. it appears to be doing something in the background even though osd.0 is
down. The ceph health output is changing!
# ceph health
HEALTH_ERR 40 pgs are stuck inactive for more than 300 seconds; 14 pgs backfill_wait; 21 pgs degraded; 10 pgs down; 2 pgs inconsistent; 10 pgs peering; 3 pgs recovering; 2 pgs recovery_wait; 30 pgs stale; 21 pgs stuck degraded; 10 pgs stuck inactive; 30 pgs stuck stale; 45 pgs stuck unclean; 16 pgs stuck undersized; 16 pgs undersized; 2 requests are blocked > 32 sec; recovery 221826/2473662 objects degraded (8.968%); recovery 254711/2473662 objects misplaced (10.297%); recovery 103/2251966 unfound (0.005%); 7 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
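An easy way to keep an eye on that progress, if it helps:

# ceph -w

(or watch ceph -s) - the degraded/misplaced counts tick down as it backfills.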
Regards, Hong
On Friday, September 1, 2017 10:37 PM, hjcho616 <[email protected]>
wrote:
Tried connecting the recovered osd. Looks like some of the files in
lost+found are superblocks. Below is the log. What can I do about this?
2017-09-01 22:27:27.634228 7f68837e5800  0 set uid:gid to 1001:1001 (ceph:ceph)
2017-09-01 22:27:27.634245 7f68837e5800  0 ceph version 10.2.9 (2ee413f77150c0f375ff6f10edd6c8f9c7d060d0), process ceph-osd, pid 5432
2017-09-01 22:27:27.635456 7f68837e5800  0 pidfile_write: ignore empty --pid-file
2017-09-01 22:27:27.646849 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-09-01 22:27:27.647077 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-09-01 22:27:27.647080 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-09-01 22:27:27.647091 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-09-01 22:27:27.678937 7f68837e5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-09-01 22:27:27.679044 7f68837e5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-09-01 22:27:27.680718 7f68837e5800  1 leveldb: Recovering log #28054
2017-09-01 22:27:27.804501 7f68837e5800  1 leveldb: Delete type=0 #28054
2017-09-01 22:27:27.804579 7f68837e5800  1 leveldb: Delete type=3 #28053
2017-09-01 22:27:35.586725 7f68837e5800  0 filestore(/var/lib/ceph/osd/ceph-0) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2017-09-01 22:27:35.587689 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.589631 7f68837e5800  1 journal _open /var/lib/ceph/osd/ceph-0/journal fd 18: 9998729216 bytes, block size 4096 bytes, directio = 1, aio = 1
2017-09-01 22:27:35.590041 7f68837e5800  1 filestore(/var/lib/ceph/osd/ceph-0) upgrade
2017-09-01 22:27:35.590149 7f68837e5800 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-09-01 22:27:35.590158 7f68837e5800 -1 osd.0 0 OSD::init() : unable to read osd superblock
2017-09-01 22:27:35.590547 7f68837e5800  1 journal close /var/lib/ceph/osd/ceph-0/journal
2017-09-01 22:27:35.611595 7f68837e5800 -1 ** ERROR: osd init failed: (22) Invalid argument
The recovered drive is mounted on /var/lib/ceph/osd/ceph-0.
# df
Filesystem     1K-blocks      Used  Available Use% Mounted on
udev               10240         0      10240   0% /dev
tmpfs            1584780      9172    1575608   1% /run
/dev/sda1       15247760   9319048    5131120  65% /
tmpfs            3961940         0    3961940   0% /dev/shm
tmpfs               5120         0       5120   0% /run/lock
tmpfs            3961940         0    3961940   0% /sys/fs/cgroup
/dev/sdb1     1952559676 634913968 1317645708  33% /var/lib/ceph/osd/ceph-0
/dev/sde1     1952559676 640365952 1312193724  33% /var/lib/ceph/osd/ceph-6
/dev/sdd1     1952559676 712018768 1240540908  37% /var/lib/ceph/osd/ceph-2
/dev/sdc1     1952559676 755827440 1196732236  39% /var/lib/ceph/osd/ceph-1
/dev/sdf1      312417560  42538060  269879500  14% /var/lib/ceph/osd/ceph-7
tmpfs             792392         0     792392   0% /run/user/0
# cd /var/lib/ceph/osd/ceph-0
# ls
activate.monmap  active  ceph_fsid  current  fsid  journal  journal_uuid
keyring  lost+found  magic  ready  store_version  superblock  sysvinit
type  whoami
Regards, Hong
On Friday, September 1, 2017 2:59 PM, hjcho616 <[email protected]>
wrote:
Found the partition, but wasn't able to mount it right away... Did an
xfs_repair on that drive.
Got a bunch of messages like this.. =(
entry "100000a89fd.00000000__head_AE319A25__0" in shortform directory 845908970 references non-existent inode 605294241
junking entry "100000a89fd.00000000__head_AE319A25__0" in directory inode 845908970
Was able to mount. lost+found has lots of files there. =P Running du seems
to show OK files in the current directory.
Will it be safe to attach this one back to the cluster? Is there a way to
tell ceph to use this drive only where data is missing? =) Or am I being
paranoid? Just plug it in? =)
Regards, Hong
On Friday, September 1, 2017 9:01 AM, hjcho616 <[email protected]>
wrote:
Looks like it has been rescued... Only 1 error, as we saw before in the
SMART log!
# ddrescue -f /dev/sda /dev/sdc ./rescue.log
GNU ddrescue 1.21
Press Ctrl-C to interrupt
     ipos: 1508 GB, non-trimmed: 0 B,    current rate: 0 B/s
     opos: 1508 GB, non-scraped: 0 B,    average rate: 88985 kB/s
non-tried: 0 B,     errsize: 4096 B,     run time: 6h 14m 40s
  rescued: 2000 GB, errors: 1,           remaining time: n/a
percent rescued: 99.99%, time since last successful read: 39s
Finished
Still missing the partition table on the new drive. =P I found a util called
testdisk for broken partition tables. Will try that tonight. =P
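From what I've read, testdisk is menu-driven; roughly:

# testdisk /dev/sdc

then Analyse -> Quick Search (-> Deeper Search if needed) -> Write, and
re-read the table with partprobe afterwards.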
Regards, Hong
On Wednesday, August 30, 2017 9:18 AM, Ronny Aasen
<[email protected]> wrote:
On 30.08.2017 15:32, Steve Taylor wrote:
I'm not familiar with dd_rescue, but I've just been reading about it. I'm
not seeing any features that would be beneficial in this scenario that aren't
also available in dd. What specific features give it "really a far better
chance of restoring a copy of your disk" than dd? I'm always interested in
learning about new recovery tools.
I see I wrote dd_rescue from old habit, but the package one should use on
Debian is gddrescue, also called GNU ddrescue.
This page has some details on the differences between dd and the ddrescue
variants:
http://www.toad.com/gnu/sysadmin/index.html#ddrescue
kind regards
Ronny Aasen
On Tue, 2017-08-29 at 21:49 +0200, Willem Jan Withagen wrote:
On 29-8-2017 19:12, Steve Taylor wrote:
Hong, probably your best chance at recovering any data without special,
expensive forensic procedures is to perform a dd from /dev/sdb to somewhere
else large enough to hold a full disk image, and attempt to repair that.
You'll want to use 'conv=noerror' with your dd command since your disk is
failing. Then you could either re-attach the OSD from the new source or
attempt to retrieve objects from the filestore on it.
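A sketch of that dd (the destination path is a placeholder; conv=noerror,sync
pads unreadable blocks with zeros so offsets stay aligned, where plain
conv=noerror would silently shorten the image):

# dd if=/dev/sdb of=/mnt/spare/sdb.img bs=4M conv=noerror,sync

Note that on a read error dd skips the whole bs-sized block, so a smaller bs
loses less data around bad sectors.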
Like somebody else already pointed out: in problem cases like this disk, use
dd_rescue. It has a really far better chance of restoring a copy of your disk.
--WjW
I have actually done this before by creating an RBD that matches the disk
size, performing the dd, running xfs_repair, and eventually adding it back
to the cluster as an OSD. RBDs as OSDs is certainly a temporary arrangement
for repair only, but I'm happy to report that it worked flawlessly in my
case. I was able to weight the OSD to 0, offload all of its data, then remove
it for a full recovery, at which point I just deleted the RBD. The
possibilities afforded by Ceph inception are endless. ☺
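The shape of that workflow was something like this (the image name and size
are made up; the image must be at least as large as the source partition):

# rbd create rescue --size 2097152        <- size is in MB, so ~2 TB
# rbd map rbd/rescue                      <- appears as e.g. /dev/rbd0
# dd if=/dev/sdb1 of=/dev/rbd0 bs=4M conv=noerror,sync
# xfs_repair /dev/rbd0

then mount /dev/rbd0 at the osd path and start the osd.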
On Mon, 2017-08-28 at 23:17 +0100, Tomasz Kusmierz wrote:
Rule of thumb with batteries is:
- the more "proper temperature" you run them at, the more life you get out of them
- the more the battery is overpowered for your application, the longer it will survive.
Get yourself an LSI 94** controller and use it as an HBA and you will be
fine. But get MORE DRIVES !!!!! …
On 28 Aug 2017, at 23:10, hjcho616 <[email protected]> wrote:
Thank you Tomasz and Ronny. I'll have to order some HDDs soon and try these
out. The car battery idea is nice! I may try that.. =) Do they last longer?
Ones that fit the UPS's original battery spec didn't last very long... part
of the reason why I gave up on them.. =P My wife probably won't like the idea
of a car battery hanging out though, ha! The OSD1 (the one with mostly OK
OSDs, except that SMART failure) motherboard doesn't have any additional SATA
connectors available. Would it be safe to add another OSD host?
Regards, Hong
On Monday, August 28, 2017 4:43 PM, Tomasz Kusmierz <[email protected]> wrote:
Sorry for being brutal … anyway:
1. get the battery for the UPS (a car battery will do as well; I've modded a
UPS in the past with a truck battery and it was working like a charm :D)
2. get spare drives and put those in, because your cluster CANNOT get out of
error due to lack of space
3. follow the advice of Ronny Aasen on how to recover data from hard drives
4. get cooling to the drives or you will lose more!
On 28 Aug 2017, at 22:39, hjcho616 <[email protected]> wrote:
Tomasz, those machines are behind a surge protector. Doesn't appear to be a
good one! I do have a UPS... but it is my fault... no battery. Power was
pretty reliable for a while... and the UPS was just beeping every chance it
had, disrupting some sleep.. =P So running on a surge protector only. I am
running this in a home environment. So far, HDD failures have been very rare
for this environment. =) It just doesn't get loaded as much! I am not sure
what to expect; seeing that "unfound", and just a feeling of possibly getting
the OSD back, made me excited about it. =) Thanks for letting me know what
should be the priority. I just lack experience and knowledge in this. =)
Please do continue to guide me through this. Thank you for the decode of
those SMART messages! I do agree, it looks like it is on its way out. I would
like to know how to get a good portion of it back if possible. =) I think I
just set the size and min_size to 1.
# ceph osd lspools
0 data,1 metadata,2 rbd,
# ceph osd pool set rbd size 1
set pool 2 size to 1
# ceph osd pool set rbd min_size 1
set pool 2 min_size to 1
Seems to be doing some backfilling work.
# ceph health
HEALTH_ERR 22 pgs are stuck inactive for more than 300 seconds; 2 pgs backfill_toofull; 74 pgs backfill_wait; 3 pgs backfilling; 108 pgs degraded; 6 pgs down; 6 pgs inconsistent; 6 pgs peering; 7 pgs recovery_wait; 16 pgs stale; 108 pgs stuck degraded; 6 pgs stuck inactive; 16 pgs stuck stale; 130 pgs stuck unclean; 101 pgs stuck undersized; 101 pgs undersized; 1 requests are blocked > 32 sec; recovery 1790657/4502340 objects degraded (39.772%); recovery 641906/4502340 objects misplaced (14.257%); recovery 147/2251990 unfound (0.007%); 50 scrub errors; mds cluster is degraded; no legacy OSD present but 'sortbitwise' flag is not set
Regards, Hong
On Monday, August 28, 2017 4:18 PM, Tomasz Kusmierz <[email protected]> wrote:
So to decode a few things about your disk:
  1 Raw_Read_Error_Rate      0x002f 100 100 051 Pre-fail Always - 37
37 read errors and only one sector marked as pending - fun disk :/
181 Program_Fail_Cnt_Total   0x0022 099 099 000 Old_age  Always - 35325174
So the firmware has quite a few bugs; that's nice.
191 G-Sense_Error_Rate       0x0022 100 100 000 Old_age  Always - 2855
The disk was thrown around while operational; even more nice.
194 Temperature_Celsius      0x0002 047 041 000 Old_age  Always - 53 (Min/Max 15/59)
If your disk passes 50 you should not consider using it; high temperatures
demagnetise the platter layer and you will see more errors in the very near
future.
197 Current_Pending_Sector   0x0032 100 100 000 Old_age  Always - 1
As mentioned before :)
200 Multi_Zone_Error_Rate    0x002a 100 100 000 Old_age  Always - 4222
Your heads keep missing tracks … bent? I don't even know how to comment here.
Generally a fun drive you've got there … rescue as much as you can and throw
it away !!!
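Those attribute lines come straight from smartmontools, e.g.:

# smartctl -a /dev/sda

(smartctl -x gives the extended report) - worth running on the other drives
too.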
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com