We went through a period where we were experiencing these daily...

cd to the PG directory on each OSD and do a find for "238e1f29.00000076024c"
(the object mentioned in your error message). This will likely return a file
with a backslash in the name, something like
rbd\udata.238e1f29.00000076024c_head_blah_1f...
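A runnable sketch of that search. The PG path in the comment and the object
filename are assumptions reconstructed from your log excerpt (FileStore
layout, osd.32, PG 3.2b8); a scratch directory stands in for the real PG
directory so the sketch can run anywhere:

```shell
# Scratch directory standing in for the real PG directory, which on a
# FileStore OSD would look like /var/lib/ceph/osd/ceph-32/current/3.2b8_head
pgdir=$(mktemp -d)

# Fabricated object filename, purely for illustration ("\\" in double
# quotes produces the literal backslash these filenames contain):
touch "$pgdir/rbd\\udata.238e1f29.00000076024c__head_4650A2B8__3"

# The search you would run inside each replica's PG directory:
find "$pgdir" -name '*238e1f29.00000076024c*'
```

Run the same find on every OSD that hosts a replica of the PG, and note each
resulting path for the comparison step.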

hexdump -C the object (tab-completing the name helps) and redirect the output
to a file. Once you have the hexdumps from each OSD, diff or cmp them against
each other and find which one is not like the others.
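A minimal sketch of the comparison on throwaway files (on the real cluster
you would hexdump each replica of the object found above and gather the dumps
on one host; the filenames here are made up):

```shell
tmp=$(mktemp -d)

# Stand-ins for the same object as stored on two different OSDs:
printf 'good data' > "$tmp/osd62.obj"
printf 'bad. data' > "$tmp/osd32.obj"

# Dump each replica's bytes; on a real cluster, scp these to one host:
hexdump -C "$tmp/osd62.obj" > "$tmp/osd62.hex"
hexdump -C "$tmp/osd32.obj" > "$tmp/osd32.hex"

# cmp exits non-zero at the first differing byte; diff shows the lines:
cmp "$tmp/osd32.hex" "$tmp/osd62.hex" || echo "this replica is the outlier"
```

With three or more replicas, the odd one out is the bad copy; the other
copies should be byte-identical.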

If the primary is not the outlier, perform the PG repair without worry. If
the primary is the outlier, you will need to stop the OSD, move the object
out of place, and start the OSD back up; it is then safe to issue a PG
repair.
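That sequence, sketched with the ids from your log (osd.32, PG 3.2b8). The
backup path is made up, the object path is whatever the earlier find
returned, and the service commands vary by init system, so treat this as an
outline rather than a recipe:

```shell
systemctl stop ceph-osd@32        # or "service ceph stop osd.32" on older inits

# Move the bad copy out of the PG directory; keep it as a backup rather
# than deleting it ($object_file is the path the earlier find returned):
mkdir -p /root/pg-3.2b8-backup
mv "$object_file" /root/pg-3.2b8-backup/

systemctl start ceph-osd@32
ceph pg repair 3.2b8              # the authoritative shard now wins
```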

Other, less common inconsistent PGs we see are differing object sizes (easy
to detect with a simple listing of file sizes) and differing attributes
("attr -l"); in those cases the error logs are usually precise in identifying
the problematic PG copy.
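Those two quick checks, demonstrated on throwaway files. "attr -l" needs the
attr package installed; "getfattr -d" is an equivalent that I believe ships
more widely. The xattr lines are commented out since those attributes only
exist on a real OSD's filesystem:

```shell
tmp=$(mktemp -d)

# Stand-ins for two replicas of one object, with a one-byte size mismatch:
printf '12345'  > "$tmp/replica-a"
printf '123456' > "$tmp/replica-b"

# Size check: a replica whose byte count disagrees is the outlier.
wc -c "$tmp/replica-a" "$tmp/replica-b"

# Attribute check, run against the real object file on each OSD:
#   attr -l "$object_file"
#   getfattr -d -m - "$object_file"
```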

On Fri, Mar 17, 2017 at 8:16 AM, Shain Miley <smi...@npr.org> wrote:

> Hello,
>
> Ceph status is showing:
>
> 1 pgs inconsistent
> 1 scrub errors
> 1 active+clean+inconsistent
>
> I located the error messages in the logfile after querying the pg in
> question:
>
> root@hqosd3:/var/log/ceph# zgrep -Hn 'ERR' ceph-osd.32.log.1.gz
>
> ceph-osd.32.log.1.gz:846:2017-03-17 02:25:20.281608 7f7744d7f700 -1
> log_channel(cluster) log [ERR] : 3.2b8 shard 32: soid
> 3/4650a2b8/rb.0.fe307e.238e1f29.00000076024c/head candidate had a read
> error, data_digest 0x84c33490 != known data_digest 0x974a24a7 from auth
> shard 62
>
>
> ceph-osd.32.log.1.gz:847:2017-03-17 02:30:40.264219 7f7744d7f700 -1
> log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 0 missing, 1 inconsistent
> objects
>
> ceph-osd.32.log.1.gz:848:2017-03-17 02:30:40.264307 7f7744d7f700 -1
> log_channel(cluster) log [ERR] : 3.2b8 deep-scrub 1 errors
>
> Is this a case where it would be safe to use 'ceph pg repair'?
> The documentation indicates there are times where running this command is
> less safe than others...and I would like to be sure before I do so.
>
> Thanks,
> Shain
>
>
> --
> NPR | Shain Miley | Manager of Infrastructure, Digital Media | smi...@npr.org 
> | 202.513.3649 <(202)%20513-3649>
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 
Brian Andrus | Cloud Systems Engineer | DreamHost
brian.and...@dreamhost.com | www.dreamhost.com
