Re: [ceph-users] unable to repair PG

Luis Periquito Fri, 12 Dec 2014 01:12:50 -0800

Hi Greg,

thanks for your help. It's always highly appreciated. :)


On Thu, Dec 11, 2014 at 6:41 PM, Gregory Farnum <g...@gregs42.com> wrote:

> On Thu, Dec 11, 2014 at 2:57 AM, Luis Periquito <periqu...@gmail.com>
> wrote:
> > Hi,
> >
> > I've stopped OSD.16, removed the PG from the local filesystem and started
> > the OSD again. After ceph rebuilt the PG in the removed OSD I ran a
> > deep-scrub and the PG is still inconsistent.
>
> What led you to remove it from osd 16? Is that the one hosting the log
> you snipped from? Is osd 16 the one hosting shard 6 of that PG, or was
> it the primary?
>
OSD 16 is both the primary for this PG and the one that has the snipped
log. The other 3 OSDs don't have any mention of this PG in their logs, just
some messages about slow requests and the backfill when I removed the object.
Actually it came from OSD.6 - currently we don't have OSD.3.

This is the output of the pg dump for this PG:
9.180    25614    0    0    0    23306482348    3001    3001
 active+clean+inconsistent    2014-12-10 17:29:01.937929    40242'1108124
 40242:23305321    [16,10,27,6]    16    [16,10,27,6]16    40242'1071363
 2014-12-10 17:29:01.937881    40242'1071363    2014-12-10 17:29:01.937881
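
For anyone reading the dump line above, here is a rough Python sketch that splits it into named fields. The column names are an assumption (taken from the usual firefly-era `ceph pg dump` header, which is not shown in this mail), and `parse_pg_dump_row` is just a hypothetical helper, not part of any ceph tooling.

```python
import re

# Assumed column order; the header line is not part of the mail above.
FIELDS = [
    "pg_stat", "objects", "mip", "degr", "unf", "bytes", "log", "disklog",
    "state", "state_stamp", "v", "reported", "up", "up_primary",
    "acting", "acting_primary", "last_scrub", "scrub_stamp",
    "last_deep_scrub", "deep_scrub_stamp",
]

# The row exactly as pasted above, joined back into one line.
ROW = (
    "9.180    25614    0    0    0    23306482348    3001    3001 "
    "active+clean+inconsistent    2014-12-10 17:29:01.937929    40242'1108124 "
    "40242:23305321    [16,10,27,6]    16    [16,10,27,6]16    40242'1071363 "
    "2014-12-10 17:29:01.937881    40242'1071363    2014-12-10 17:29:01.937881"
)


def parse_pg_dump_row(row):
    """Split one pg dump row into a dict keyed by the assumed column names."""
    # The paste fused "[16,10,27,6]" with the following "16"; separate them.
    row = re.sub(r"\](?=\d)", "] ", row)
    tokens = row.split()

    # Timestamps contain a space, so glue each date back onto its time.
    merged = []
    i = 0
    while i < len(tokens):
        if re.fullmatch(r"\d{4}-\d{2}-\d{2}", tokens[i]) and i + 1 < len(tokens):
            merged.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1

    if len(merged) != len(FIELDS):
        raise ValueError("unexpected number of columns: %d" % len(merged))

    rec = dict(zip(FIELDS, merged))
    # Turn "[16,10,27,6]" into a list of OSD ids.
    for key in ("up", "acting"):
        rec[key] = [int(x) for x in rec[key].strip("[]").split(",")]
    return rec


rec = parse_pg_dump_row(ROW)
print(rec["state"], rec["acting"], "primary:", rec["acting_primary"])
```

Read this way, the acting set has four OSDs with 16 as primary, and OSD 6 is one of them.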


> Anyway, the message means that shard 6 (which I think is the seventh
> OSD in the list) of PG 9.180 is missing a bunch of xattrs on object
> 370cbf80/29145.4_xxx/head//9. I'm actually a little surprised it
> didn't crash if it's missing the "_" attr....
> -Greg
>
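
To make the scrub message Greg describes easier to read, here is a small Python sketch that pulls the PG id, shard, object and the list of missing attrs out of that log line. The regular expression is only an assumption based on the single line quoted in this thread, not on any documented log format, and `parse_scrub_error` is a hypothetical helper.

```python
import re

# The [ERR] line from the OSD log, joined back into one line.
LINE = (
    "2014-12-10 10:55:27.556358 7f8f618be700  0 log [ERR] : 9.180 shard 6: soid "
    "370cbf80/29145.4_xxx/head//9 missing attr _, missing attr _user.rgw.acl, "
    "missing attr _user.rgw.content_type, missing attr _user.rgw.etag, "
    "missing attr _user.rgw.idtag, missing attr _user.rgw.manifest, "
    "missing attr _user.rgw.x-amz-meta-md5sum, "
    "missing attr _user.rgw.x-amz-meta-stat, missing attr snapset"
)

# Pattern guessed from the one example line above.
PATTERN = re.compile(
    r"\[ERR\] : (?P<pgid>\S+) shard (?P<shard>\d+): soid (?P<soid>\S+) (?P<problems>.*)$"
)


def parse_scrub_error(line):
    """Return pgid, shard, soid and missing attrs, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Each problem looks like "missing attr <name>", separated by commas.
    missing = re.findall(r"missing attr ([^,\s]+)", m.group("problems"))
    return {
        "pgid": m.group("pgid"),
        "shard": int(m.group("shard")),
        "soid": m.group("soid"),
        "missing_attrs": missing,
    }


info = parse_scrub_error(LINE)
print(info["pgid"], "shard", info["shard"], info["soid"])
for name in info["missing_attrs"]:
    print("  missing:", name)
```

For this line it lists nine missing attrs on the one object, from "_" through the _user.rgw.* ones to "snapset".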

Any idea on how to fix it?


>
> >
> > I'm running out of ideas on trying to solve this. Does this mean that all
> > copies of the object should also be inconsistent? Should I just try to
> > figure which object/bucket this belongs to and delete it/copy it again to
> > the ceph cluster?
> >
> > Also, do you know what the error message means? is it just some sort of
> > metadata for this object that isn't correct, not the object itself?
> >
> > On Wed, Dec 10, 2014 at 11:11 AM, Luis Periquito <periqu...@gmail.com>
> > wrote:
> >>
> >> Hi,
> >>
> >> In the last few days this PG (pool is .rgw.buckets) has been in error
> >> after running the scrub process.
> >>
> >> After getting the error, and trying to see what may be the issue (and
> >> finding none), I've just issued a ceph repair followed by a ceph
> >> deep-scrub.
> >> However it doesn't seem to have fixed the issue and it still remains.
> >>
> >> The relevant log from the OSD is as follows.
> >>
> >> 2014-12-10 09:38:09.348110 7f8f618be700  0 log [ERR] : 9.180 deep-scrub 0 missing, 1 inconsistent objects
> >> 2014-12-10 09:38:09.348116 7f8f618be700  0 log [ERR] : 9.180 deep-scrub 1 errors
> >> 2014-12-10 10:13:15.922065 7f8f618be700  0 log [INF] : 9.180 repair ok, 0 fixed
> >> 2014-12-10 10:55:27.556358 7f8f618be700  0 log [ERR] : 9.180 shard 6: soid 370cbf80/29145.4_xxx/head//9 missing attr _, missing attr _user.rgw.acl, missing attr _user.rgw.content_type, missing attr _user.rgw.etag, missing attr _user.rgw.idtag, missing attr _user.rgw.manifest, missing attr _user.rgw.x-amz-meta-md5sum, missing attr _user.rgw.x-amz-meta-stat, missing attr snapset
> >> 2014-12-10 10:56:50.597952 7f8f618be700  0 log [ERR] : 9.180 deep-scrub 0 missing, 1 inconsistent objects
> >> 2014-12-10 10:56:50.597957 7f8f618be700  0 log [ERR] : 9.180 deep-scrub 1 errors
> >>
> >> I'm running version firefly 0.80.7.
> >
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>

