The OSD should have logged the identities of the inconsistent objects
to the central log on the monitors, as well as to its own local log
file. You'll need to identify for yourself which version is correct,
which will probably involve going and looking at them inside each
OSD's data store. If the primary is correct for all the objects in a
PG, you can just run 'ceph pg repair' on it; otherwise you'll want to
copy the correct replica's copy over the primary's. Sorry. :/
(If you have no way of checking for yourself which copy is correct, and
you have more than 2 replicas, you can compare the stored copies and just
take the one held by the majority, which is probably correct.)
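
A rough sketch of that majority comparison, in case it helps. It assumes
you have already copied each replica's file for the object out of the
OSD data stores to local paths; the labels and paths are hypothetical.
It hashes file contents with SHA-256, which is not the checksum deep
scrub uses, and it does not look at xattrs or omap data, so treat a
match as a hint rather than proof.

```python
import hashlib
from collections import Counter


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large objects do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def majority_copy(copies):
    """Find the copy content held by a strict majority of replicas.

    `copies` maps a label you choose (e.g. "osd.3", hypothetical) to the
    local path of that replica's copy of the object.

    Returns (sorted labels holding the majority content, digest), or
    None if no content is held by more than half of the copies.
    """
    if not copies:
        return None
    digests = {label: file_digest(path) for label, path in copies.items()}
    digest, count = Counter(digests.values()).most_common(1)[0]
    # Require a strict majority; a 1-1 split between 2 replicas decides nothing.
    if count * 2 <= len(digests):
        return None
    winners = sorted(label for label, d in digests.items() if d == digest)
    return winners, digest
```

Calling majority_copy({"osd.0": path_a, "osd.1": path_b, "osd.2": path_c})
returns the labels whose copies agree, or None when there is no strict
majority, which is always the case for two replicas that disagree.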
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


On Thu, Jun 12, 2014 at 7:27 PM, Aaron Ten Clay <aaro...@aarontc.com> wrote:
> I'm having trouble finding a concise set of steps to repair inconsistent
> placement groups. I know from other threads that issuing a 'ceph pg repair
> ...' command could cause loss of data integrity if the primary OSD happens
> to have the bad copy of the placement group. I know how to find which PG's
> are bad (ceph pg dump), but I'm not sure how to figure out which objects in
> the PG failed their CRCs during the deep scrub, and I'm not sure how to get
> the correct CRC so I can determine which OSD holds the correct copy.
>
> Maybe I'm on the wrong path entirely? If someone knows how to resolve this,
> I'd appreciate some insight. I think this would be a good topic for adding
> to the OSD/PG operations section of the manual, or at least a wiki article.
>
> Thanks!
> -Aaron
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
