I hope someone can help us with an MDS caching problem.

Ceph version 18.2.4 with cephadm container deployment.

Question 1:
It's not clear to me how much cache/memory you should allocate to the MDS. Is
this based on the number of open files, the number of caps, or something else?
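For reference, this is roughly how I look at the numbers right now: a small
Python sketch that reads the cache status and sums the caps over all client
sessions via 'ceph tell'. The daemon name below is just a placeholder, and I'm
assuming the admin keyring is available on the host where it runs.

    #!/usr/bin/env python3
    # Rough sketch: compare MDS cache usage with the number of caps it tracks.
    # The daemon name is a placeholder; run where the admin keyring is present.
    import json
    import subprocess

    MDS = "mds.cephfs.node1.abcdef"  # placeholder daemon name

    def tell(*args):
        out = subprocess.check_output(
            ["ceph", "tell", MDS, *args, "--format", "json"])
        return json.loads(out)

    cache = tell("cache", "status")    # {"pool": {"items": ..., "bytes": ...}}
    sessions = tell("session", "ls")   # one entry per client session

    total_caps = sum(s.get("num_caps", 0) for s in sessions)
    print(f"cache bytes : {cache['pool']['bytes']:,}")
    print(f"cache items : {cache['pool']['items']:,}")
    print(f"client caps : {total_caps:,} across {len(sessions)} sessions")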

Question 2/Problem:
At the moment we have MDS nodes with 32 GB of memory and a configured cache
limit of 20 GB. There are 4 MDS nodes: 2 active and 2 in standby-replay mode
(with max_mds set to 2, of course). We pinned the top-level directories to
specific ranks, so the balancer isn't used.
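For completeness, this is roughly how the setup maps to commands, sketched as a
Python wrapper around the CLI (the filesystem name 'cephfs' and the mount point
'/mnt/cephfs' are placeholders, not necessarily what we use):

    #!/usr/bin/env python3
    # Sketch of how the setup maps to commands (not a script we actually run).
    # Filesystem name 'cephfs' and mount point '/mnt/cephfs' are placeholders.
    import os
    import subprocess

    def ceph(*args):
        subprocess.run(["ceph", *args], check=True)

    # 20 GiB cache limit per MDS (the nodes have 32 GB of RAM)
    ceph("config", "set", "mds", "mds_cache_memory_limit", str(20 * 1024**3))

    # two active ranks plus standby-replay daemons
    ceph("fs", "set", "cephfs", "max_mds", "2")
    ceph("fs", "set", "cephfs", "allow_standby_replay", "true")

    # pin the top-level directories to a rank so the balancer isn't involved
    os.setxattr("/mnt/cephfs/app2", "ceph.dir.pin", b"1")  # rank 1
    os.setxattr("/mnt/cephfs/app4", "ceph.dir.pin", b"1")  # rank 1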
The memory usage is mostly increasing, with an occasional small dip of a couple
hundred MB freed. After all the memory is consumed, swap gets used. This frees
a couple hundred MB of memory, but not much more. When the swap eventually runs
out and memory is full, the MDS service stops and the cluster logs show:
1. no beacon from mds
2. marking mds up:active laggy
3. replacing mds
4. MDS daemon <daemon> is removed because it is dead or otherwise unavailable
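If it helps, I can also log the numbers over time to match against the Grafana
graphs, with something like this minimal polling sketch (again a placeholder
daemon name; the exact perf counter names may differ per release):

    #!/usr/bin/env python3
    # Minimal polling sketch: log cache size vs. the daemon's own rss/heap
    # counters once a minute, as CSV on stdout. Placeholder daemon name.
    import json
    import subprocess
    import time

    MDS = "mds.cephfs.node1.abcdef"  # placeholder daemon name

    def tell(*args):
        out = subprocess.check_output(
            ["ceph", "tell", MDS, *args, "--format", "json"])
        return json.loads(out)

    print("time,cache_bytes,mds_mem_rss,mds_mem_heap")
    while True:
        cache = tell("cache", "status")["pool"]["bytes"]
        mem = tell("perf", "dump").get("mds_mem", {})  # fields may vary
        print(f"{int(time.time())},{cache},{mem.get('rss')},{mem.get('heap')}")
        time.sleep(60)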

For example: we have the top-level folders app2 and app4, which are pinned to
rank 1. Folder app2 is always accessed by 4 clients (application servers), and
the same goes for folder app4. Folder app2 is 3 times larger than folder app4
(last time I checked; I don't want to run a du at the moment).
After a couple of hours the memory usage of the MDS server stays around 18%
(Grafana shows a flatline for 7 hours).
At night the 9th client connects and first makes a backup with rsync of the
latest snapshot folder of app2; after a 5-minute pause the same happens for
folder app4.
When the backup starts, the memory increases to 70% and stays at 70% after the
backup of app2 is completed. 5 minutes later the memory starts increasing again
with the start of the backup of folder app4. When the backup is done, it's at
78% and stays there for the rest of the day.
Why isn't the memory usage decreasing after the rsync is completed?
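One thing I want to check after the next nightly run is whether the backup
client still holds a large number of caps once the rsync is finished, since (as
far as I understand) outstanding caps keep those inodes pinned in the MDS
cache. A minimal sketch, with a placeholder daemon name again:

    #!/usr/bin/env python3
    # Sketch: list per-session cap counts after the nightly rsync, to see
    # whether the backup client is still holding caps. Placeholder daemon name.
    import json
    import subprocess

    MDS = "mds.cephfs.node1.abcdef"  # placeholder, rank 1 in our case

    out = subprocess.check_output(
        ["ceph", "tell", MDS, "session", "ls", "--format", "json"])
    for s in json.loads(out):
        # client_metadata usually carries the hostname of the client mount
        host = s.get("client_metadata", {}).get("hostname", "?")
        print(f"client {s.get('id')} on {host}: {s.get('num_caps', 0):,} caps")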

Is there a memory leak with the MDS service?

PS: I have some small log files and Grafana screenshots, but I'm not sure how
to share them.

Kind regards,
Sake