I was running Ceph 10.2.9 on the servers, with 10.2.6 clients (I think; 
whatever is in CentOS’s Jewel-SIG repo).

I had an issue where the MDS cluster stopped working and wasn’t responding to 
cache pressure. I restarted the MDSs, and they failed to replay the journal.

Long story short, I managed to get things sort of working. I upgraded to 
Luminous 12.1.4rc because it has the more developed cephfs-data-scan tools, 
including scan_links (which Jewel did not have). Even though things are mostly 
working, there is obviously still some corruption in links and metadata, as 
I’m getting logs of them.

What I need to know is: how can I fix this so that all the data corruption is 
cleared? I’ve gone through the steps in the disaster recovery documentation. 
As a last-ditch attempt, I’m re-ordering things slightly by running 
scan_frags first, then scan_extents and scan_inodes, hoping that it can 
repair some of the damage.
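
To make the ordering above concrete, here is a minimal sketch that only 
prints the three cephfs-data-scan invocations in the order described; it 
does not run them. The pool name "cephfs_data" and the helper names are 
placeholders for illustration, not taken from my cluster, and any extra 
options the tools may need are left out.

```python
import shlex

# Assumption: placeholder name, substitute the real CephFS data pool.
DATA_POOL = "cephfs_data"


def build_scan_commands(data_pool):
    """Return the scan steps as argv lists, in the order described above."""
    return [
        ["cephfs-data-scan", "scan_frags"],
        ["cephfs-data-scan", "scan_extents", data_pool],
        ["cephfs-data-scan", "scan_inodes", data_pool],
    ]


def format_command(argv):
    """Render one argv list as a shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Dry run: show the commands so the order can be reviewed first.
    for argv in build_scan_commands(DATA_POOL):
        print(format_command(argv))
```

Printing rather than executing is deliberate: these tools modify metadata, 
so the sequence should be checked against the disaster recovery 
documentation before anything is run.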

At the very least, since nothing important seems to be corrupted or damaged, 
I want to repair or delete the damaged links/references and clean all of that 
up so things run reliably again.

Eric Renfro
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
