Hi Massimo,

On Wed, Oct 29, 2025 at 8:19 PM Massimo Sgaravatto
<[email protected]> wrote:
>
> Hi Venky
> Thanks for your answer
>
> No: we are not using snapshots

Was the MDS cache memory running close to mds_cache_memory_limit? This
is available in the perf dump via

        $ ceph tell mds.<id> perf dump

-- look for mds_co_bytes. Alternatively, if you can recreate the issue,
please capture these details.
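In case it helps, here's a rough sketch (Python) of what I mean by checking
mds_co_bytes against the limit. The embedded JSON is a made-up excerpt
standing in for the real perf dump, and I'm searching for the counter by
name rather than assuming which section it lives in:

```python
import json

# Hypothetical excerpt of `ceph tell mds.<id> perf dump` output; a real
# dump has many sections, so we search for the counter by key name.
sample_dump = json.loads('{"mds_mem": {"mds_co_bytes": 4026531840}}')

def find_counter(node, name):
    """Recursively search a perf-dump dict for a counter by key name."""
    if isinstance(node, dict):
        if name in node:
            return node[name]
        for value in node.values():
            found = find_counter(value, name)
            if found is not None:
                return found
    return None

co_bytes = find_counter(sample_dump, "mds_co_bytes")
cache_limit = 4 * 1024**3  # mds_cache_memory_limit; 4 GiB is the default
print(f"mds_co_bytes = {co_bytes} ({co_bytes / cache_limit:.0%} of limit)")
```

If that ratio is sitting near (or above) 100% when the space blow-up
happens, that would be a useful data point.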

And what's mds_cache_memory_limit set to, BTW?
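Also, to see whether the purge queue is keeping up with the deletes, the
perf dump has a purge_queue section. A sketch of the kind of check I'd do
(the counter names and values below are from memory / made up, so verify
them against your actual dump):

```python
import json

# Hypothetical `purge_queue` section of a perf dump; values are made up.
pq = json.loads(
    '{"pq_executing": 0, "pq_executing_ops": 0, "pq_item_in_journal": 250000}'
)

# A large backlog with nothing executing would suggest the purge queue is
# stuck rather than merely slow.
backlog = pq["pq_item_in_journal"]
executing = pq["pq_executing"] + pq["pq_executing_ops"]
if backlog > 0 and executing == 0:
    print(f"purge queue may be stalled: {backlog} items queued, none executing")
```

Watching the backlog over a few minutes (does it shrink at all?) is more
telling than a single sample.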

>
> Regards, Massimo
>
> On Wed, Oct 29, 2025 at 3:05 PM Venky Shankar <[email protected]> wrote:
>>
>> Hi Massimo,
>>
>> On Tue, Oct 28, 2025 at 5:30 PM Massimo Sgaravatto
>> <[email protected]> wrote:
>> >
>> > Dear all
>> >
>> > We have a portion of a Cephfs file system that maps to a ceph pool called
>> > cephfs_data_ssd.
>> >
>> >
>> > If I perform a "du -sh" on this portion of the file system, I see that the
>> > value matches the "STORED" field of the "ceph df" output for the
>> > cephfs_data_ssd pool.
>> >
>> > So far so good.
>> >
>> > I set a quota of 4.5 TB for this file system area.
>> >
>> >
>> > During the weekend, this pool (and other pools of the same device class)
>> > became nearfull.
>> >
>> > A "ceph df" showed that the problem was indeed in the cephfs_data_ssd
>> > pool, with a reported usage of 7 TiB of data (21 TiB in replica 3):
>> >
>> > cephfs_data_ssd 62 32 7.1 TiB 2.02M 21 TiB 89.06 898 GiB
>> >
>> >
>> > This sounds strange to me because I set a quota of 4.5 TB in that area, and
>> > because a "du -sh" of the relevant directory showed a usage of 600 GB.
>> >
>> >
>> > When I lowered the disk quota from 4.5 TB to 600 GB, the jobs writing
>> > in that area failed (because of disk quota exceeded) and after a while
>> > the space was released.
>> >
>> >
>> > The only explanation I can think of is that, as far as I understand,
>> > cephfs can take a while to release the space for deleted files
>> > (https://docs.ceph.com/en/reef/dev/delayed-delete/).
>> >
>> >
>> > This would also be consistent with the fact that it looks like some jobs
>> > were performing a lot of writes and deletions (they kept writing a ~ 5GB
>> > checkpoint file, and deleting the previous one after each iteration).
>>
>> That's likely what is causing the high pool usage -- the files are
>> logically gone (du doesn't see them), but their objects are still
>> sitting in the data pool consuming space, and for some reason aren't
>> being deleted by the purge queue in the MDS. Do you use snapshots?
>>
>> >
>> >
>> >
>> > How can I understand from the log files if this was indeed the problem?
>> >
>> > Or do you have some other possible explanations for this problem?
>> >
>> > And, most important, how can I prevent scenarios such as this one?
>> >
>> > Thanks, Massimo
>> > _______________________________________________
>> > ceph-users mailing list -- [email protected]
>> > To unsubscribe send an email to [email protected]
>> >
>>
>>
>> --
>> Cheers,
>> Venky
>>


-- 
Cheers,
Venky