But with enough memory on the MDS, can't I just cache all of the metadata in memory? Right now there is around 500GB of metadata on the SSDs. Is this not enough?
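For reference, a minimal sketch of how one might check current MDS cache usage and raise the cache limit on Luminous; the commands and the 64GB figure are illustrative assumptions, not something recommended in this thread, and <name> is a placeholder for the MDS daemon name:

    # Illustrative only -- run on the MDS host; replace <name> with the MDS daemon name.
    # Show how much memory the MDS cache currently holds (command output may differ
    # slightly between Luminous point releases).
    ceph daemon mds.<name> cache status
    ceph daemon mds.<name> perf dump

    # The cache size is controlled by mds_cache_memory_limit (in bytes).
    # Hypothetical example: raise it to 64GB at runtime; persist the same value
    # under the [mds] section of ceph.conf so it survives a restart.
    ceph tell mds.<name> injectargs '--mds_cache_memory_limit=68719476736'

Whether this actually helps depends on client behaviour, as discussed below.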
On Tue, Jan 22, 2019 at 5:48 PM Yan, Zheng <uker...@gmail.com> wrote:
> On Tue, Jan 22, 2019 at 10:49 AM Albert Yue <transuranium....@gmail.com> wrote:
> >
> > Hi Yan Zheng,
> >
> > In your opinion, can we resolve this issue by moving the MDS to a 512GB or 1TB memory machine?
> >
>
> The problem is on the client side, especially clients with large memory. I don't think enlarging the mds cache size is a good idea. You can periodically check each kernel client's /sys/kernel/debug/ceph/xxx/caps and run 'echo 2 > /proc/sys/vm/drop_caches' on a client that holds too many caps (for example 10k).
>
> > On Mon, Jan 21, 2019 at 10:49 PM Yan, Zheng <uker...@gmail.com> wrote:
> >>
> >> On Mon, Jan 21, 2019 at 11:16 AM Albert Yue <transuranium....@gmail.com> wrote:
> >> >
> >> > Dear Ceph Users,
> >> >
> >> > We have set up a CephFS cluster with 6 OSD machines, each with 16 8TB hard disks. The Ceph version is Luminous 12.2.5. We created one data pool with these hard disks and another metadata pool with 3 SSDs. We created an MDS with a 65GB cache size.
> >> >
> >> > But our users keep complaining that CephFS is too slow. What we observed is that CephFS is fast when we switch to a new MDS instance; once the cache fills up (which happens very fast), clients become very slow when performing basic filesystem operations such as `ls`.
> >> >
> >>
> >> It seems that clients hold lots of unused inodes in their icache, which prevents the mds from trimming the corresponding objects from its cache. Mimic has the command "ceph daemon mds.x cache drop" to ask a client to drop its cache. I'm also working on a patch that makes the kernel client release unused inodes.
> >>
> >> For Luminous, there is not much we can do, except periodically run "echo 2 > /proc/sys/vm/drop_caches" on each client.
> >>
> >>
> >> > What we know is that our users are putting lots of small files into CephFS; there are now around 560 million files. We didn't see high CPU wait on the MDS instance, and the metadata pool used only around 200MB of space.
> >> >
> >> > My question is, what is the relationship between the metadata pool and the MDS? Is this performance issue caused by the hardware behind the metadata pool? Why has the metadata pool used only 200MB of space, yet we see 3k IOPS on each of these three SSDs, and why can't the MDS cache all of this 200MB in memory?
> >> >
> >> > Thanks very much!
> >> >
> >> >
> >> > Best Regards,
> >> >
> >> > Albert
> >> >
> >> > _______________________________________________
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
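To make the periodic check suggested above concrete, here is a minimal sketch, assuming it runs as root on each kernel client, that the caps file starts with a "total <N>" line (the format varies by kernel version), and using the 10k figure from the thread as the threshold:

    #!/bin/bash
    # Sketch: on this client, check the caps count for each CephFS mount under
    # /sys/kernel/debug/ceph and drop clean dentries/inodes when it exceeds a threshold.
    THRESHOLD=10000
    for caps_file in /sys/kernel/debug/ceph/*/caps; do
        [ -e "$caps_file" ] || continue
        # Many kernels report a leading line of the form: total <N>
        total=$(awk '$1 == "total" {print $2; exit}' "$caps_file")
        if [ -n "$total" ] && [ "$total" -gt "$THRESHOLD" ]; then
            echo "client holds $total caps, dropping dentries and inodes"
            echo 2 > /proc/sys/vm/drop_caches
        fi
    done

Run from cron on each client, this approximates what the "cache drop" command automates in Mimic.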