Re: [ceph-users] cephfs metadata damage and scrub error

2017-07-18 Thread Mazzystr
Any update to this? I also have the same problem:

    # for i in $(cat pg_dump | grep 'stale+active+clean' | awk '{print $1}'); do echo -n "$i: "; rados list-inconsistent-obj $i; echo; done
    107.ff: {"epoch":10762,"inconsistents":[]}
    ...

and so on for the 49 PGs that I think I had a problem with #
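A slightly cleaner version of that loop is sketched below. It is an assumption that the PG list comes straight from "ceph pg dump pgs_brief" (the original reads a saved pg_dump file) and that a deep scrub has run recently enough for list-inconsistent-obj to have anything to report; an empty "inconsistents" array, as in the 107.ff output above, usually just means no scrub errors are currently recorded for that PG.

    # Sketch only: list scrub inconsistencies for every PG currently flagged
    # inconsistent. Assumes the PG IDs come from "ceph pg dump pgs_brief"
    # rather than a pre-saved pg_dump file.
    for pg in $(ceph pg dump pgs_brief 2>/dev/null | grep inconsistent | awk '{print $1}'); do
        echo -n "$pg: "
        rados list-inconsistent-obj "$pg" --format=json-pretty
        echo
    done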

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-30 Thread James Eckersall
Further to this, we managed to repair the inconsistent PG by comparing the object digests and removing the one that didn't match (3 of 4 replicas had the same digest, 1 didn't), and then issuing a pg repair and scrub. This has removed the inconsistent flag on the PG; however, we are still seeing the
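For the archive, the rough command sequence for that kind of repair looks like the sketch below. This is a generic reconstruction, not necessarily the exact steps used here; PG 2.9 is taken from the original report, and removing the odd replica out is normally done with ceph-objectstore-tool while the owning OSD is stopped.

    # Sketch: compare the per-replica digests, then repair and re-scrub.
    # 2.9 is the inconsistent PG from the original report.
    rados list-inconsistent-obj 2.9 --format=json-pretty   # compare omap_digest / data_digest per shard
    # ...remove the mismatching replica on its OSD (with that OSD stopped,
    # e.g. via ceph-objectstore-tool), then ask the primary to repair:
    ceph pg repair 2.9
    ceph pg deep-scrub 2.9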

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-17 Thread James Eckersall
An update to this. The cluster has been upgraded to Kraken, but I've still got the same PG reporting as inconsistent and the same error message about damaged MDS metadata. Can anyone offer any further advice please? If you need output from the ceph-osdomap-tool, could you please explain how to use it
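In case it helps a later reader: ceph-osdomap-tool is generally run against a stopped OSD's omap directory. The invocation below is an assumption based on the tool's usual options (verify against the --help output on the affected host), with OSD 3 as a placeholder.

    # Assumed invocation; check "ceph-osdomap-tool --help" for the flags on
    # your release. The OSD must be stopped before its omap store is opened.
    systemctl stop ceph-osd@3
    ceph-osdomap-tool --omap-path /var/lib/ceph/osd/ceph-3/current/omap --command check
    systemctl start ceph-osd@3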

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-03 Thread James Eckersall
Hi David, Thanks for the reply, it's appreciated. We're going to upgrade the cluster to Kraken and see if that fixes the metadata issue. J

On 2 May 2017 at 17:00, David Zafman wrote:
> James,
> You have an omap corruption. It is likely caused by a bug which has
> already been identifi

Re: [ceph-users] cephfs metadata damage and scrub error

2017-05-02 Thread David Zafman
James, You have an omap corruption. It is likely caused by a bug which has already been identified. A fix for that problem is available but it is still pending backport for the next Jewel point release. All 4 of your replicas have different "omap_digest" values. Instead of the xattrs
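The per-replica digests David refers to can be seen in the scrub error report for the PG; a minimal way to eyeball them is sketched below, assuming PG 2.9 from the original post.

    # Sketch: show which OSD holds which omap_digest for the inconsistent PG.
    rados list-inconsistent-obj 2.9 --format=json-pretty | grep -E '"osd"|"omap_digest"'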

[ceph-users] cephfs metadata damage and scrub error

2017-05-02 Thread James Eckersall
Hi, I'm having some issues with a ceph cluster. It's an 8-node cluster running Jewel ceph-10.2.7-0.el7.x86_64 on CentOS 7. This cluster provides RBDs and a CephFS filesystem to a number of clients. ceph health detail is showing the following errors: pg 2.9 is active+clean+inconsistent, acting [3
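The usual first checks for this combination of symptoms are sketched below; mds.0 is a placeholder for whatever the active MDS is called in this cluster, and 2.9 is the PG from the health output above.

    # Sketch: look at both halves of the problem, the inconsistent PG and
    # the damage entries the MDS has recorded.
    ceph health detail
    ceph pg 2.9 query                  # acting set and last scrub stamps for the bad PG
    ceph tell mds.0 damage ls          # metadata damage table kept by the MDS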