I reformatted 2 OSDs in a cluster with 2 replicas. I tried to get as
much data off them as possible beforehand, using ceph osd out, but I
couldn't get it all.
I know I've lost data.
I have 1 incomplete PG, which is better than I expected. Following
previous advice, I ran
ceph pg force_create_pg 11.483
The PG switches to 'creating' for a while, then goes back to 'incomplete':
2014-04-12 12:20:22.356297 mon.0 [INF] pgmap v5602996: 2592 pgs: 2035
active+clean, 553 active+remapped+wait_backfill, 2 active+recovery_wait,
1 active+remapped+backfilling, 1 incomplete; 15086 GB data, 30576 GB
used, 29011 GB / 59588 GB avail; 4606075/41313663 objects degraded
(11.149%); 24965 kB/s, 34 objects/s recovering
2014-04-12 12:20:25.737277 mon.0 [INF] pgmap v5602997: 2592 pgs: 1
creating, 2035 active+clean, 553 active+remapped+wait_backfill, 2
active+recovery_wait, 1 active+remapped+backfilling; 15086 GB data,
30576 GB used, 29011 GB / 59588 GB avail; 4606075/41313663 objects
degraded (11.149%); 16179 kB/s, 22 objects/s recovering
<snip>
2014-04-12 12:21:29.141144 osd.3 [WRN] 3 slow requests, 1 included
below; oldest blocked for > 444.032652 secs
2014-04-12 12:21:29.141148 osd.3 [WRN] slow request 30.377846 seconds
old, received at 2014-04-12 12:20:58.763265: osd_op(client.57449388.0:1
.dir.us-west-1.51941060.1 [delete] 11.7c96a483 e28552) v4 currently
reached pg
<snip>
2014-04-12 12:23:33.160096 mon.0 [INF] osdmap e28553: 16 osds: 16 up, 16 in
2014-04-12 12:23:33.197448 mon.0 [INF] pgmap v5603063: 2592 pgs: 1
creating, 2037 active+clean, 552 active+remapped+wait_backfill, 2
active+remapped+backfilling; 15086 GB data, 30584 GB used, 29003 GB /
59588 GB avail; 4597857/41313663 objects degraded (11.129%); 26137 kB/s,
28 objects/s recovering
2014-04-12 12:23:34.196847 mon.0 [INF] osdmap e28554: 16 osds: 16 up, 16 in
2014-04-12 12:23:34.224192 mon.0 [INF] pgmap v5603064: 2592 pgs: 2037
active+clean, 552 active+remapped+wait_backfill, 2
active+remapped+backfilling, 1 incomplete; 15086 GB data, 30585 GB used,
29002 GB / 59588 GB avail; 4597857/41313663 objects degraded (11.129%)
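To make the creating -> incomplete flip easier to spot in a long log, here is a minimal Python sketch that pulls the per-state PG counts out of a pgmap line. It assumes the pgmap line format shown above, with the mail-wrapped lines re-joined and the tail of each line cut off after "GB used". parse_pgmap_states is a hypothetical helper, not part of Ceph.

```python
import re

# Hypothetical helper (not part of Ceph): extract the per-state PG counts
# from one "pgmap" monitor log line, after re-joining the wrapped lines.
PGMAP_RE = re.compile(r"pgmap v(\d+): (\d+) pgs: (.*?);")

def parse_pgmap_states(line):
    """Return (version, total_pgs, {state: count}), or None if not a pgmap line."""
    m = PGMAP_RE.search(line)
    if m is None:
        return None
    version, total = int(m.group(1)), int(m.group(2))
    states = {}
    for part in m.group(3).split(","):
        count, state = part.strip().split(" ", 1)
        states[state] = int(count)
    return version, total, states

# The last two pgmap lines from the log above, truncated after "GB used".
before = ("2014-04-12 12:23:33.197448 mon.0 [INF] pgmap v5603063: 2592 pgs: 1 "
          "creating, 2037 active+clean, 552 active+remapped+wait_backfill, 2 "
          "active+remapped+backfilling; 15086 GB data, 30584 GB used")
after = ("2014-04-12 12:23:34.224192 mon.0 [INF] pgmap v5603064: 2592 pgs: 2037 "
         "active+clean, 552 active+remapped+wait_backfill, 2 "
         "active+remapped+backfilling, 1 incomplete; 15086 GB data, 30585 GB used")

for line in (before, after):
    version, total, states = parse_pgmap_states(line)
    # The per-state counts should add up to the total PG count.
    assert sum(states.values()) == total
    print(version, states.get("creating", 0), states.get("incomplete", 0))
# prints:
# 5603063 1 0
# 5603064 0 1
```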
The blocked request (the delete on 11.7c96a483 in the slow request
above) is on the incomplete PG, 11.483.
The PG query output is 2.3 MiB: https://cd.centraldesktop.com/p/eAAAAAAADSsLAAAAAH2kja0
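The link between the slow request's hash (11.7c96a483) and PG 11.483 can be checked with a short Python sketch of the stable-mod folding that, as far as I understand it, Ceph uses to map an object hash onto a PG. The pool's pg_num is not stated anywhere in this mail, so pg_num=2048 below is an assumption, and pg_for_hash is a hypothetical helper name.

```python
def ceph_stable_mod(x, b, bmask):
    """Fold hash x into [0, b), mirroring the logic of Ceph's ceph_stable_mod()."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)

def pg_for_hash(pool, obj_hash, pg_num):
    """Hypothetical helper: format the PG id that obj_hash folds into."""
    # bmask is (smallest power of two >= pg_num) - 1
    bmask = (1 << (pg_num - 1).bit_length()) - 1
    seed = ceph_stable_mod(obj_hash, pg_num, bmask)
    return "%d.%x" % (pool, seed)

# Pool 11 and hash 0x7c96a483 come from the slow request in the log.
# pg_num=2048 is an ASSUMPTION; the real value is not given in this mail.
print(pg_for_hash(11, 0x7c96a483, 2048))  # prints 11.483
```

The pool's real pg_num can be read with ceph osd pool get <pool> pg_num. For this particular hash, 4096 gives the same answer as 2048.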
The query is from after the PG switched back to incomplete.
I'm running 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60).
How can I get this PG clean again?
Once it's clean, is there an RGW fsck/scrub I can run?
Any advice is appreciated.
--
*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com