On Mon, Jan 28, 2019 at 10:34 AM Albert Yue <transuranium....@gmail.com> wrote:
>
> Hi Yan Zheng,
>
> Our clients are also complaining about operations like 'du' or 'ncdu' being
> very slow. Is there any alternative tool for such kind of operation on
> CephFS? Thanks!
>

'du' traverses the whole directory tree to calculate sizes. CephFS maintains
recursive statistics for every directory, which avoids that walk; see the
ceph.dir.* virtual extended attributes for details.
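For example, on any client with the filesystem mounted (the mount point and
path below are just illustrative), a 'du'-style answer is a single metadata
lookup instead of a full tree walk. The values are maintained lazily by the
MDS, so they can lag slightly behind:

  # recursive size, in bytes, of everything under a directory
  getfattr -n ceph.dir.rbytes /mnt/cephfs/some/dir

  # recursive file, subdirectory and total entry counts
  getfattr -n ceph.dir.rfiles   /mnt/cephfs/some/dir
  getfattr -n ceph.dir.rsubdirs /mnt/cephfs/some/dir
  getfattr -n ceph.dir.rentries /mnt/cephfs/some/dir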
> Best regards,
> Albert
>
> On Wed, Jan 23, 2019 at 11:04 AM Yan, Zheng <uker...@gmail.com> wrote:
>>
>> On Wed, Jan 23, 2019 at 10:02 AM Albert Yue <transuranium....@gmail.com>
>> wrote:
>> >
>> > But with enough memory on MDS, I can just cache all metadata into memory.
>> > Right now there is around 500GB of metadata on the SSDs. So this is not
>> > enough?
>> >
>>
>> The mds needs to track lots of extra information for each object. For 500G
>> of metadata, the mds may need 1T or more of memory.
>>
>> > On Tue, Jan 22, 2019 at 5:48 PM Yan, Zheng <uker...@gmail.com> wrote:
>> >>
>> >> On Tue, Jan 22, 2019 at 10:49 AM Albert Yue <transuranium....@gmail.com>
>> >> wrote:
>> >> >
>> >> > Hi Yan Zheng,
>> >> >
>> >> > In your opinion, can we resolve this issue by moving the MDS to a
>> >> > 512GB or 1TB memory machine?
>> >> >
>> >>
>> >> The problem is on the client side, especially clients with lots of
>> >> memory. I don't think enlarging the mds cache size is a good idea. You
>> >> can periodically check each kernel client's
>> >> /sys/kernel/debug/ceph/xxx/caps and run 'echo 2 >/proc/sys/vm/drop_caches'
>> >> on that client if it holds too many caps (for example 10k).
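In practice, the check described above looks something like this on each
client node, run as root. This is only a sketch: it assumes debugfs is
mounted at /sys/kernel/debug, the exact layout of the caps file varies
between kernel versions, and the 10k threshold is just the example figure
quoted above:

  # The per-mount 'caps' file reports how many caps the client holds
  # (newer kernels also list them individually).
  for d in /sys/kernel/debug/ceph/*; do
      echo "== $d =="
      cat "$d/caps"
  done

  # If a client is sitting on too many caps (for example more than ~10k),
  # drop its clean dentry/inode caches so the unused caps can be released
  # back to the MDS:
  echo 2 > /proc/sys/vm/drop_caches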
>> >> > On Mon, Jan 21, 2019 at 10:49 PM Yan, Zheng <uker...@gmail.com> wrote:
>> >> >>
>> >> >> On Mon, Jan 21, 2019 at 11:16 AM Albert Yue
>> >> >> <transuranium....@gmail.com> wrote:
>> >> >> >
>> >> >> > Dear Ceph Users,
>> >> >> >
>> >> >> > We have set up a CephFS cluster with 6 osd machines, each with 16 x
>> >> >> > 8TB hard disks. The Ceph version is luminous 12.2.5. We created one
>> >> >> > data pool on these hard disks and another metadata pool on 3 SSDs.
>> >> >> > We created an MDS with a 65GB cache size.
>> >> >> >
>> >> >> > But our users keep complaining that CephFS is too slow. What we
>> >> >> > observed is that CephFS is fast when we switch to a new MDS
>> >> >> > instance, but once the cache fills up (which happens very fast),
>> >> >> > clients become very slow when performing basic filesystem
>> >> >> > operations such as `ls`.
>> >> >> >
>> >> >>
>> >> >> It seems that clients hold lots of unused inodes in their icache,
>> >> >> which prevents the mds from trimming the corresponding objects from
>> >> >> its cache. mimic has the command "ceph daemon mds.x cache drop" to
>> >> >> ask clients to drop their caches. I'm also working on a patch that
>> >> >> makes the kernel client release unused inodes.
>> >> >>
>> >> >> For luminous, there is not much we can do, except periodically run
>> >> >> "echo 2 > /proc/sys/vm/drop_caches" on each client.
>> >> >>
>> >> >> > What we know is that our users are putting lots of small files into
>> >> >> > CephFS; there are now around 560 million files. We didn't see high
>> >> >> > CPU wait on the MDS instance, and the metadata pool only used
>> >> >> > around 200MB of space.
>> >> >> >
>> >> >> > My question is, what is the relationship between the metadata pool
>> >> >> > and the MDS? Is this performance issue caused by the hardware
>> >> >> > behind the metadata pool? Why has the metadata pool only used 200MB
>> >> >> > of space while we see 3k iops on each of these three SSDs, and why
>> >> >> > can't the MDS cache all of these 200MB in memory?
>> >> >> >
>> >> >> > Thanks very much!
>> >> >> >
>> >> >> > Best Regards,
>> >> >> >
>> >> >> > Albert

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com