On Mon, Jun 16, 2014 at 11:16 AM, Gregory Farnum <g...@inktank.com> wrote:
> On Mon, Jun 16, 2014 at 11:11 AM, Aaron Ten Clay <aaro...@aarontc.com>
> wrote:
> > I would also like to see Ceph get smarter about inconsistent PGs. If we
> > can't automate the repair, at least the "ceph pg repair" command should
> > figure out which copy is correct and use that, instead of overwriting all
> > OSDs with whatever the primary has.
> >
> > Is it impossible to get the expected CRC out of Ceph so I can detect which
> > object is correct, instead of looking at the contents or comparing copies
> > from multiple OSDs?
>
> The CRCs are pretty unlikely to happen (for replicated pools) until
> there's kernel support for end-to-end data integrity. I imagine that
> our next step will be a vote-for-correctness system, but it needs to
> be designed and up until nowish there just haven't been enough people
> running the software and getting inconsistent PGs for it to be a pain
> point.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>

Hmm, it seems to happen about once a month or so on my cluster, another PG
will go inconsistent. I'm not surprised, really, since I interpret this as
bitrot detection and the read failure rate on typical 4TB SATA disks is
about on par with this for 24/7 activity. I'm surprised more people aren't
encountering this and needing to resolve inconsistent PGs.

In terms of CRCs - it's my understanding that OSDs detect bad objects by
comparing the on-disk file's CRC with the stored CRC. Is this incorrect? If
not, that is what I'd like to do manually so I can detect which object(s)
are correct.

-Aaron
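
A rough sketch of the manual comparison and the "vote-for-correctness" idea
discussed above, assuming each replica of the object has already been copied
off its OSD to a local file. The function names and OSD labels are
hypothetical, and this is not how Ceph scrub or "ceph pg repair" works
internally. Note that a majority only shows which copies agree; it does not
prove those copies are the correct ones.

```python
import hashlib
from collections import Counter


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def vote(digests):
    """Given {osd_label: digest}, return (winning_digest, agreeing, dissenting).

    winning_digest is None when no digest is held by a strict majority,
    in which case every label is reported as dissenting.
    """
    if not digests:
        return None, [], []
    counts = Counter(digests.values())
    digest, n = counts.most_common(1)[0]
    if n * 2 <= len(digests):
        # No strict majority: refuse to pick a winner.
        return None, [], sorted(digests)
    agreeing = sorted(k for k, v in digests.items() if v == digest)
    dissenting = sorted(k for k, v in digests.items() if v != digest)
    return digest, agreeing, dissenting
```

With three replicas, one would build the mapping with something like
`{label: file_digest(path) for label, path in copies.items()}` and pass it to
`vote()`. With only two replicas that disagree, there is no majority and the
sketch deliberately returns no winner.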
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com