scan_extents using 8 threads took 82 hours for my cluster holding 120M files on 
12 OSDs with 1 Gbps between nodes. I would have gone with a lot more threads if I 
had known it only operated on the data pool and the only bottleneck was network 
latency. If I recall correctly, each worker used up to 800 MB of RAM, so beware the 
OOM killer.
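
To put a number on the RAM warning, here is a rough sketch for picking a worker count per node. The 800 MB per worker is only my recollection from above, and the 2 GB reserve for the OS and OSD daemons is an arbitrary assumption, so adjust both for your hosts.

```python
def max_workers_for_ram(free_ram_mb, per_worker_mb=800, reserve_mb=2048):
    """Upper bound on workers that fit in free RAM while keeping some headroom.

    per_worker_mb: recalled peak usage per worker (an estimate, not measured).
    reserve_mb: memory left untouched for everything else (assumed value).
    """
    usable_mb = free_ram_mb - reserve_mb
    return max(usable_mb // per_worker_mb, 0)


# Example: a node with 16 GB of free RAM.
print(max_workers_for_ram(16 * 1024))
```

With these assumed figures a node with 16 GB free fits 17 workers; my 8 threads would have needed roughly 6.4 GB at peak.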
scan_inodes runs several times faster, but I don’t remember the exact timing.
In your case I believe scan_extents and scan_inodes can be done in a few hours by 
running the tool on each OSD node, but scan_links will be painfully slow due to 
its single-threaded nature.
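
A minimal sketch of how the workers could be spread across the OSD nodes. It only prints the command lines, it does not run anything. The --worker_n and --worker_m flags are how I remember the Ceph disaster-recovery documentation describing parallel workers, so check them against your release; the node names and the pool name cephfs_data are made up for illustration.

```python
def worker_commands(nodes, workers_per_node, data_pool, step="scan_extents"):
    """Build one cephfs-data-scan command per worker, grouped by node.

    Every worker gets a unique index (--worker_n) out of the same
    total (--worker_m), so the shards do not overlap.
    """
    total = len(nodes) * workers_per_node
    plan = {}
    for i, node in enumerate(nodes):
        plan[node] = [
            f"cephfs-data-scan {step} --worker_n {i * workers_per_node + j} "
            f"--worker_m {total} {data_pool}"
            for j in range(workers_per_node)
        ]
    return plan


# Hypothetical node names and pool name.
plan = worker_commands(["osd-node-1", "osd-node-2"], 2, "cephfs_data")
for node, cmds in plan.items():
    for cmd in cmds:
        print(f"{node}: {cmd}")
```

All scan_extents workers have to finish before any scan_inodes worker starts, so run the two steps as separate rounds.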
In my case I ended up getting the MDS to start and copied all the data to a fresh 
filesystem, ignoring errors.
On Nov 4, 2018, 02:22 +0300, Rhian Resnick <xan...@sepiidae.com>, wrote:
> For a 150 TB file system with 40 million files, how many cephfs-data-scan 
> threads should be used? Or what is the expected run time? (We have 160 OSDs 
> with 4 TB disks.)
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com