On Mon, Jun 16, 2014 at 11:16 AM, Gregory Farnum <g...@inktank.com> wrote:

> On Mon, Jun 16, 2014 at 11:11 AM, Aaron Ten Clay <aaro...@aarontc.com>
> wrote:
> > I would also like to see Ceph get smarter about inconsistent PGs. If we
> > can't automate the repair, at least the "ceph pg repair" command should
> > figure out which copy is correct and use that, instead of overwriting all
> > OSDs with whatever the primary has.
> >
> > Is it impossible to get the expected CRC out of Ceph so I can detect which
> > object is correct, instead of looking at the contents or comparing copies
> > from multiple OSDs?
>
> The CRCs are pretty unlikely to happen (for replicated pools) until
> there's kernel support for end-to-end data integrity. I imagine that
> our next step will be a vote-for-correctness system, but it needs to
> be designed and up until nowish there just haven't been enough people
> running the software and getting inconsistent PGs for it to be a pain
> point.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>

Hmm, on my cluster another PG goes inconsistent about once a month or so.
I'm not surprised, really, since I interpret this as bitrot detection, and
the read failure rate of typical 4TB SATA disks under 24/7 activity is about
on par with that. I'm surprised more people aren't encountering this and
needing to resolve inconsistent PGs.
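
To put a rough number on the "on par" claim above, here is a back-of-envelope
sketch. Every figure in it is an assumption for illustration, not a
measurement from my cluster: a completely full 4TB disk, a spec-sheet rate of
one unrecoverable read error per 1e14 bits read, and about one end-to-end
read of the disk per week (for example from a weekly deep scrub).

```python
# Back-of-envelope estimate; every constant below is an assumption.
DISK_BYTES = 4e12          # 4 TB disk, assumed completely full
URE_PER_BIT = 1e-14        # assumed spec-sheet rate: 1 read error per 1e14 bits
FULL_READS_PER_MONTH = 4   # assumed: roughly one full read of the disk per week


def expected_read_errors(disk_bytes, ure_per_bit, full_reads):
    """Expected read errors when a disk is read end to end `full_reads` times."""
    bits_read = disk_bytes * 8 * full_reads
    return bits_read * ure_per_bit


per_disk = expected_read_errors(DISK_BYTES, URE_PER_BIT, FULL_READS_PER_MONTH)
print(f"expected read errors per disk per month: {per_disk:.2f}")
```

Under those assumptions the estimate comes out on the order of one error per
disk per month. Real disks are rarely full and often beat their spec-sheet
rate, so treat this as an upper-end sanity check only.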

In terms of CRCs: it's my understanding that OSDs detect bad objects by
comparing the on-disk file's CRC with the stored CRC. Is this incorrect? If
my understanding is right, that is what I'd like to do manually so I can
detect which object(s) are correct.
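
For reference, a minimal sketch of the manual comparison I have in mind,
assuming I have already copied the same object's file from each OSD to local
paths. It does not read any checksum out of Ceph (I don't know that one is
exposed); it only computes a CRC32 per copy and takes a majority vote, along
the lines of the vote-for-correctness idea mentioned above. The function
names and the OSD labels are hypothetical.

```python
import zlib
from collections import Counter


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC32 of a file, reading it in chunks to bound memory use."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def vote(checksums):
    """Pick the checksum held by a strict majority of copies.

    `checksums` maps a label (e.g. a hypothetical "osd.3") to that copy's CRC.
    Returns (winning_crc, labels_that_disagree). If no checksum has a strict
    majority, returns (None, all_labels) so nothing is trusted automatically.
    """
    counts = Counter(checksums.values())
    crc, n = counts.most_common(1)[0]
    if n * 2 <= len(checksums):
        return None, sorted(checksums)
    return crc, sorted(label for label, c in checksums.items() if c != crc)
```

With three replicas, two matching copies outvote the third. With only two
replicas that disagree there is no majority, and the sketch deliberately
refuses to pick a winner.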

-Aaron
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
