Keep in mind that in order for the workers not to overlap each other you need to set the total number of workers (worker_m) to nodes*20, and assign each node its own processing range (worker_n). A rough example is at the bottom of this message, below the quoted thread.

On Nov 4, 2018, 03:43 +0300, Rhian Resnick <xan...@sepiidae.com>, wrote:
> Sounds like we are going to restart with 20 threads on each storage node.
>
> On Sat, Nov 3, 2018 at 8:26 PM Sergey Malinin <h...@newmail.com> wrote:
> > scan_extents using 8 threads took 82 hours for my cluster holding 120M
> > files on 12 OSDs with 1 Gbps between nodes. I would have gone with a lot
> > more threads if I had known it only operated on the data pool and the
> > only problem was network latency. If I recall correctly, each worker
> > used up to 800 MB of RAM, so beware the OOM killer.
> > scan_inodes runs several times faster, but I don’t remember the exact timing.
> > In your case I believe scan_extents & scan_inodes can be done in a few
> > hours by running the tool on each OSD node, but scan_links will be
> > painfully slow due to its single-threaded nature.
> > In my case I ended up getting the MDS to start and copied all the data
> > to a fresh filesystem, ignoring errors.
> > On Nov 4, 2018, 02:22 +0300, Rhian Resnick <xan...@sepiidae.com>, wrote:
> > > For a 150TB file system with 40 million files, how many cephfs-data-scan
> > > threads should be used? Or what is the expected run time? (We have 160
> > > OSDs with 4TB disks.)
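For reference, here is a rough sketch of how the ranges could be split. It assumes 3 OSD nodes with 20 workers each (worker_m = 60) and a data pool named cephfs_data (substitute your own node count and pool name):

    # Run on each OSD node, with NODE_ID set to 0, 1, or 2 respectively.
    NODE_ID=0              # this node's index (assumption: 3 nodes total)
    WORKERS_PER_NODE=20
    TOTAL_WORKERS=60       # nodes * 20, passed as worker_m
    POOL=cephfs_data       # placeholder: your CephFS data pool name

    for i in $(seq 0 $((WORKERS_PER_NODE - 1))); do
        n=$((NODE_ID * WORKERS_PER_NODE + i))   # global worker index (worker_n)
        cephfs-data-scan scan_extents --worker_n $n --worker_m $TOTAL_WORKERS $POOL &
    done
    wait    # block until every worker on this node has finished

Every worker on every node must complete scan_extents before any worker begins scan_inodes; the same worker_n/worker_m pattern then applies to that phase.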
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com