On Fri, Apr 29, 2016 at 9:34 AM, Mike Lovell <mike.lov...@endurance.com>
wrote:

> On Fri, Apr 29, 2016 at 5:54 AM, Alexey Sheplyakov <
> asheplya...@mirantis.com> wrote:
>
>> Hi,
>>
>> > i also wonder if just taking 148 out of the cluster (probably just
>> marking it out) would help
>>
>> As far as I understand this can only harm your data. The acting set of PG
>> 17.73 is [41, 148], so after stopping/taking out OSD 148, OSD 41 will
>> store the only copy of the objects in PG 17.73 (so it won't accept writes
>> any more).
>>
>> > since there are other osds in the up set (140 and 5)
>>
>> These OSDs are not in the acting set; they are missing (at least some of)
>> the objects from PG 17.73 and are copying them from OSDs 41 and 148.
>> Naturally this slows down or even blocks writes to PG 17.73.
>>
>
> ok. i didn't know if it could just use the members of the up set that are
> not in the acting set to complete writes. when thinking through it in my
> head it seemed reasonable, but i could also see pitfalls with doing it.
> that's why i was asking if it was possible.
>
>
> > the only thing holding things together right now is a while loop doing
>> a 'ceph osd down 41' every minute
>>
>> As far as I understand this disturbs the backfilling and further delays
>> writes to that poor PG.
>>
>
> it definitely does seem to have an impact like that. the only upside is
> that it clears the slow io messages, though i don't know if it actually
> lets the client io complete. recovery doesn't make any progress in between
> the down commands; it's not making any progress on its own anyway.
>
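
for anyone following along, the up vs acting sets for the pg are easy to
check with the standard pg commands. a rough sketch, with the pg id from
above plugged in:

    # show the up set and acting set the osdmap currently has for the pg
    ceph pg map 17.73

    # more detail, including the recovery/backfill state reported by the
    # primary
    ceph pg 17.73 query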

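and just to be explicit about the loop i mentioned, it's basically this
(from memory, so treat it as a sketch; the interval is the one minute i
mentioned):

    # keep marking osd.41 down once a minute to clear the slow io messages
    # it keeps generating
    while true; do
        ceph osd down 41
        sleep 60
    done
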
i went to check things this morning and noticed that the number of misplaced
objects had dropped below what i was expecting, and i was occasionally seeing
lines from ceph -w saying a number of objects were recovering. the only PG in
a state other than active+clean was the one that 41 and 148 were bickering
about, so it looks like they were now passing traffic. it appears to have
started just after one of the osd down events from the loop i had running. a
little while after the backfill started making progress, it completed. so
it's fine now. i would still like to try to find out the cause since this has
happened twice now, but at least it's not an emergency for me at the moment.
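
for completeness, keeping an eye on the recovery was just the usual status
commands, something along these lines (nothing specific to this case):

    # watch cluster events and recovery/backfill progress as they happen
    ceph -w

    # list any pgs stuck in an unclean state
    ceph pg dump_stuck unclean

    # per-pg detail on anything that is currently unhealthy
    ceph health detail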

one other odd thing was that i saw the misplaced object count go negative
during the backfill. this is one of the lines from ceph -w:

2016-04-29 10:38:15.011241 mon.0 [INF] pgmap v27055697: 6144 pgs: 6143
active+clean, 1 active+undersized+degraded+remapped+backfilling; 123 TB
data, 372 TB used, 304 TB / 691 TB avail; 130 MB/s rd, 135 MB/s wr, 11210
op/s; 14547/93845634 objects degraded (0.016%); -13959/93845634 objects
misplaced (-0.015%); 27358 kB/s, 7 objects/s recovering

it seemed to complete around the point where it got to about -14.5k
misplaced. i'm guessing this is just a reporting error, but i immediately
started a deep-scrub on the pg just to make sure everything is consistent.
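
for reference, the scrub kick-off and the follow-up check are just:

    # ask the primary osd to deep-scrub the pg, re-reading and comparing
    # the object data across the replicas
    ceph pg deep-scrub 17.73

    # afterwards, look for any inconsistent pgs it may have flagged
    ceph health detail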

mike
