Hello Venky,

Thanks for your help in debugging this issue.

I am using the default value for mds_cache_memory_limit (4 GiB).
Should I increase this value, since the server hosting the MDS has much more
memory?
Are there any guidelines on how to set this parameter with respect to the
physical memory available on the server hosting the MDS daemon? I can't find
this in:

https://docs.ceph.com/en/reef/cephfs/cache-configuration/
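
As a first attempt, this is roughly what I was planning to do (just a sketch
on my side; the 16 GiB value below is only an example, not a recommendation I
found documented anywhere):

        # value currently in effect for the MDS daemons
        $ ceph config get mds mds_cache_memory_limit

        # raise it, e.g. to 16 GiB, if that turns out to be advisable
        $ ceph config set mds mds_cache_memory_limit 17179869184

        # then keep an eye on how close the cache gets to the limit
        $ ceph tell mds.<id> cache status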


Thanks, Massimo



On Thu, Oct 30, 2025 at 9:07 AM Venky Shankar <[email protected]> wrote:

> Hi Massimo,
>
> On Wed, Oct 29, 2025 at 8:19 PM Massimo Sgaravatto
> <[email protected]> wrote:
> >
> > Hi Venky
> > Thanks for your answer
> >
> > No: we are not using snapshots
>
> Was the MDS cache memory running close to mds_cache_memory_limit? This
> is available in perf dump via
>
>         $ ceph tell mds.<id> perf dump
>
> and look for mds_co_bytes. Or, if you can recreate the issue, please
> capture these details.
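>
> For example (just a sketch -- grepping the pretty-printed JSON output
> rather than relying on its exact structure):
>
>         $ ceph tell mds.<id> perf dump | grep mds_co_bytes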
>
> And what's mds_cache_memory_limit set to BTW?
>
> >
> > Regards, Massimo
> >
> > On Wed, Oct 29, 2025 at 3:05 PM Venky Shankar <[email protected]>
> > wrote:
> >>
> >> Hi Massimo,
> >>
> >> On Tue, Oct 28, 2025 at 5:30 PM Massimo Sgaravatto
> >> <[email protected]> wrote:
> >> >
> >> > Dear all
> >> >
> >> > We have a portion of a CephFS file system that maps to a Ceph pool
> >> > called cephfs_data_ssd.
> >> >
> >> >
> >> > If I perform a "du -sh" on this portion of the file system, I see
> >> > that the value matches the "STORED" field of the "ceph df" output for
> >> > the cephfs_data_ssd pool.
> >> >
> >> > So far so good.
> >> >
> >> > I set a quota of 4.5 TB for this file system area.
> >> >
> >> >
> >> > During the weekend, this pool (and other pools of the same device
> >> > class) became nearfull.
> >> >
> >> > A "ceph df" showed that the problem was indeed in the the
> cephfs_data_ssd
> >> > pool, with a reported usage of 7 TiB of data (21 TiB in replica 3):
> >> >
> >> > POOL             ID   PGS   STORED    OBJECTS   USED     %USED   MAX AVAIL
> >> > cephfs_data_ssd  62    32   7.1 TiB     2.02M   21 TiB   89.06     898 GiB
> >> >
> >> >
> >> > This sounds strange to me because I set a quota of 4.5 TB in that
> >> > area, and because a "du -sh" of the relevant directory showed a usage
> >> > of 600 GB.
> >> >
> >> >
> >> > When I lowered the disk quota from 4.5 TB to 600 GB, the jobs writing
> >> > in that area failed (because the disk quota was exceeded), and after a
> >> > while the space was released.
> >> >
> >> >
> >> > The only explanation I can think of is that, as far as I understand,
> >> > CephFS can take a while to release the space for deleted files
> >> > (https://docs.ceph.com/en/reef/dev/delayed-delete/).
> >> >
> >> >
> >> > This would also be consistent with the fact that it looks like some
> >> > jobs were performing a lot of writes and deletions (they kept writing
> >> > a ~5 GB checkpoint file, and deleting the previous one after each
> >> > iteration).
> >>
> >> That's likely what is causing the high pool usage -- the files are
> >> logically gone (du doesn't see them), but the objects are still lying
> >> in the data pool consuming space, and for some reason they aren't
> >> getting deleted by the purge queue in the MDS. Do you use snapshots?
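> >>
> >> If it is the purge queue lagging behind, something like this might show
> >> it (a rough sketch; the exact counter names can differ across releases):
> >>
> >>         $ ceph tell mds.<id> perf dump | grep -E 'num_strays|pq_'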
> >>
> >> >
> >> >
> >> >
> >> > How can I tell from the log files whether this was indeed the problem?
> >> >
> >> > Or do you have some other possible explanations for this problem?
> >> >
> >> > And, most importantly, how can I prevent scenarios such as this one?
> >> >
> >> > Thanks, Massimo
> >>
> >>
> >> --
> >> Cheers,
> >> Venky
> >>
>
>
> --
> Cheers,
> Venky
>
>
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
