Thanks Sam, I'll take a look. Seems sensible enough and worth a shot.
We'll probably call it a day after this and flatten it, but I'm
wondering if it's possible some rbd devices may not touch these pgs and
could still be exportable? Will have a tinker!
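
(A hedged sketch of one way to check that before exporting; the pool name
"rbd", image name "test-vm" and bad pg 0.19 are illustrative, not taken
from the thread. The idea is to map each object backing an image to its pg
and only export images whose objects avoid the dead pgs.)

  # Find the image's object prefix, then map its backing objects to pgs.
  prefix=$(rbd info rbd/test-vm | awk '/block_name_prefix/ {print $2}')
  # Note: "rados ls" can block if some of the pool's pgs are inactive.
  rados -p rbd ls | grep "^${prefix}" | while read obj; do
      ceph osd map rbd "$obj"          # prints which pg the object maps to
  done | grep -c '(0\.19)'             # non-zero means the image touches pg 0.19

  # If no objects land in the dead pgs, the image should still export cleanly:
  rbd export rbd/test-vm /backup/test-vm.img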
On Wed, Mar 11, 2015 at 7:06 PM, Samuel Just wrote:
Sure thing, n.b. I increased pg count to see if it would help. Alas not. :)
Thanks again!
health_detail
https://gist.github.com/199bab6d3a9fe30fbcae
osd_dump
https://gist.github.com/499178c542fa08cc33bb
osd_tree
https://gist.github.com/02b62b2501cbd684f9b2
Randomly selected queries:
queries/0.19
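
(For reference, the pg count increase mentioned above amounts to something
like the following; the pool name and new count are illustrative:)

  ceph osd pool set rbd pg_num 512     # raise the placement group count
  ceph osd pool set rbd pgp_num 512    # then raise pgp_num to match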
For each of those pgs, you'll need to identify the pg copy you want to
be the winner and either
1) Remove all of the other ones using ceph-objectstore-tool and
hopefully the winner you left alone will allow the pg to recover and go
active.
2) Export the winner using ceph-objectstore-tool, use
c
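
(A minimal sketch of the ceph-objectstore-tool invocations those two options
describe; the osd ids, paths and pgid 0.19 are illustrative, and the OSD in
question must be stopped before the tool is run against it:)

  # export the chosen "winner" copy of pg 0.19 from osd.12
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --journal-path /var/lib/ceph/osd/ceph-12/journal \
      --op export --pgid 0.19 --file /root/0.19.export

  # remove an unwanted copy of the same pg from another OSD
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-20 \
      --journal-path /var/lib/ceph/osd/ceph-20/journal \
      --op remove --pgid 0.19

  # if needed, import the exported winner back into an OSD
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --journal-path /var/lib/ceph/osd/ceph-12/journal \
      --op import --file /root/0.19.export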
I'd like to not have to null them if possible; there's nothing
outlandishly valuable, it's more the time to reprovision (users have
stuff on there, mainly testing, but I have a nasty feeling some users
won't have backed up their test instances). When you say complicated
and fragile, could you expand?
Ok, you lost all copies from an interval where the pgs went active. The
recovery from this is going to be complicated and fragile. Are the
pools valuable?
-Sam
On 03/11/2015 03:35 AM, joel.merr...@gmail.com wrote:
For clarity too, I've tried to drop the min_size before as suggested;
it doesn't make a difference unfortunately.
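
(The min_size drop referred to here is a one-liner along these lines, with
the pool name illustrative; it only helps when a pg still has at least one
complete surviving copy:)

  ceph osd pool set rbd min_size 1
  ceph osd dump | grep min_size        # confirm the pool setting took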
On Wed, Mar 11, 2015 at 9:50 AM, joel.merr...@gmail.com
wrote:
Yeah, get a ceph pg query on one of the stuck ones.
-Sam
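
(A sketch of gathering that, with pg 0.19 standing in for any stuck pg:)

  ceph pg dump_stuck inactive          # list pgs stuck inactive
  ceph pg dump_stuck unclean           # list pgs stuck unclean
  ceph pg 0.19 query                   # full peering state; look at "blocked_by"
                                       # and "down_osds_we_would_probe"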
On Tue, 2015-03-10 at 14:41 +0000, joel.merr...@gmail.com wrote:
Stuck unclean and stuck inactive. I can fire up a full query and
health dump somewhere useful if you want (full pg query info on ones
listed in health detail, tree, osd dump etc). There were blocked_by
operations that no longer exist after doing the OSD addition.
Side note, spent some time yesterd
What do you mean by "unblocked" but still "stuck"?
-Sam
On Mon, 2015-03-09 at 22:54 +0000, joel.merr...@gmail.com wrote:
On Mon, Mar 9, 2015 at 2:28 PM, Samuel Just wrote:
> You'll probably have to recreate osds with the same ids (empty ones),
> let them boot, stop them, and mark them lost. There is a feature in the
> tracker to improve this behavior: http://tracker.ceph.com/issues/10976
> -Sam
Thanks Sam, I've re
You'll probably have to recreate osds with the same ids (empty ones),
let them boot, stop them, and mark them lost. There is a feature in the
tracker to improve this behavior: http://tracker.ceph.com/issues/10976
-Sam
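
(One possible command sequence for that recreate/boot/stop/mark-lost dance,
assuming osd.12 is one of the removed ids, default data paths and sysvinit
init scripts; a ceph-disk or ceph-deploy setup would differ:)

  ceph osd create                                  # hands back the lowest free id, e.g. 12
  mkdir -p /var/lib/ceph/osd/ceph-12
  ceph-osd -i 12 --mkfs --mkkey                    # empty data dir plus keyring
  ceph auth add osd.12 osd 'allow *' mon 'allow rwx' \
      -i /var/lib/ceph/osd/ceph-12/keyring
  ceph osd crush add osd.12 0.0 host=$(hostname -s)   # weight 0 so no data moves
  service ceph start osd.12                        # let it boot and join
  service ceph stop osd.12                         # then stop it again
  ceph osd lost 12 --yes-i-really-mean-it          # and mark it lost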
On Mon, 2015-03-09 at 12:24 +0000, joel.merr...@gmail.com wrote:
Hi,
I'm trying to fix an issue within 0.93 on our internal cloud related
to incomplete pgs (yes, I realise the folly of running the dev release
- it's a not-so-test env now, so I really need to recover this). I'll
detail the current outage info:
72 initial (now 65) OSDs
6 nodes
* Update to 0.92