Yes.  See a recent thread regarding just this.

I helped a community member whose users had deleted roughly 250 million tiny files;
lazy reclamation was crawling along under the default settings.

https://lists.ceph.io/hyperkitty/list/[email protected]/thread/DP7D77KUWTYGU7OZ6M7MBXN3BOKAESEG/#74ERTBDBEVSB2PIBEPAYTCOBH3OMLJKE
 

https://docs.clyso.com/docs/kb/cephfs/data-growth-purge-queue/
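If the backlog is in the purge queue, raising the purge throttles usually helps. As a rough sketch (the option names are the real MDS/filer settings, but the values below are illustrative starting points, not recommendations — tune for your cluster and watch the resulting OSD load):

```shell
# Illustrative values only -- raise the MDS purge throttles.
# Defaults (per the config dump below): 10 / 64 / 8192 / 0.5.
ceph config set mds filer_max_purge_ops 40
ceph config set mds mds_max_purge_files 256
ceph config set mds mds_max_purge_ops 32768
ceph config set mds mds_max_purge_ops_per_pg 2.0
```

Afterwards, watch pq_item_in_journal and pq_executing_ops in the perf counters to confirm the queue is actually draining faster.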

> On Nov 19, 2025, at 5:25 AM, Massimo Sgaravatto 
> <[email protected]> wrote:
> 
> Dear all
> 
> In our CephFS installation I have seen, a couple of times, a discrepancy
> between the "ceph df" and "du -sh" outputs (see also this thread:
> https://lists.ceph.io/hyperkitty/list/[email protected]/thread/BBED4ADXO3CE4FYLCNUWB4OML6N6CTZU/).
> 
> 
> I suspect the problem is caused by a large number of deletions performed
> by some users (I know that some of their jobs periodically write a
> checkpoint file and then delete the previous one).
> 
> 
> 
> Trying to reproduce the problem, I am running a script with 30 parallel
> threads, where each thread:
> - writes a 40 GB file
> - sleeps 5 secs
> - deletes the file produced in the previous iteration
> 
> After some hours I have been able to reproduce the issue. Right now
> "ceph df" shows a usage of ~ 11 TB (~ 33 TB considering the replica 3) while
> "du -sh" shows a usage of about 2.7 TB.
> 
> 
> We have 2 active (and 1 standby) MDS instances.
> 
> 
> In one MDS I see:
> 
> [root@ceph-mds-01 ~]# ceph --admin-daemon
> /run/ceph/ceph-mds.ceph-mds-01.asok perf dump | jq '.["purge_queue"]'
> {
>  "pq_executing_ops": 10240,
>  "pq_executing_ops_high_water": 13202,
>  "pq_executing": 1,
>  "pq_executing_high_water": 16,
>  "pq_executed_ops": 74781922,
>  "pq_executed": 924254,
>  "pq_item_in_journal": 26986
> }
> 
> while in the second one:
> 
> [root@ceph-mds-02 ~]# ceph --admin-daemon
> /run/ceph/ceph-mds.ceph-mds-02.asok perf dump | jq '.["purge_queue"]'
> {
>  "pq_executing_ops": 0,
>  "pq_executing_ops_high_water": 0,
>  "pq_executing": 0,
>  "pq_executing_high_water": 0,
>  "pq_executed_ops": 0,
>  "pq_executed": 0,
>  "pq_item_in_journal": 0
> }
> 
> I am using the default values for filer_max_purge_ops and the
> mds_max_purge_* options:
> 
> [root@ceph-mds-01 ~]# ceph daemon /run/ceph/ceph-mds.ceph-mds-01.asok
> config show | grep filer_max_purge
>    "filer_max_purge_ops": "10",
> [root@ceph-mds-01 ~]# ceph daemon /run/ceph/ceph-mds.ceph-mds-01.asok
> config show | grep mds_max_purge
>    "mds_max_purge_files": "64",
>    "mds_max_purge_ops": "8192",
>    "mds_max_purge_ops_per_pg": "0.500000",
> 
> mds_cache_memory_limit is set to 32 GiB and doing a:
> 
> # ceph daemon mds.<mds> perf dump | grep mds_co_bytes
> 
> I see the following values for the two MDS instances, respectively:
> 
> 8237145057
> 3479698
> 
> 
> We are running Ceph Reef (we will soon upgrade to Squid)
> 
> 
> Should I try to increase filer_max_purge_ops ?
> 
> 
> Thanks a lot
> Cheers, Massimo
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
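To watch the backlog drain over time, the purge_queue counters can be polled and parsed. A minimal sketch in Python, assuming you have captured the `perf dump` output as JSON text (the field names match the output quoted above; the sample here reuses mds-01's numbers):

```python
import json

def purge_backlog(perf_dump_json: str) -> dict:
    """Extract purge-queue health numbers from an MDS `perf dump` JSON blob."""
    pq = json.loads(perf_dump_json)["purge_queue"]
    return {
        "in_flight_ops": pq["pq_executing_ops"],     # RADOS ops currently in flight
        "journaled_items": pq["pq_item_in_journal"], # files still waiting to be purged
        "executed_items": pq["pq_executed"],         # files purged so far
    }

# Sample taken from the mds-01 perf dump quoted above.
sample = """{"purge_queue": {"pq_executing_ops": 10240,
  "pq_executing_ops_high_water": 13202, "pq_executing": 1,
  "pq_executing_high_water": 16, "pq_executed_ops": 74781922,
  "pq_executed": 924254, "pq_item_in_journal": 26986}}"""

print(purge_backlog(sample))
# {'in_flight_ops': 10240, 'journaled_items': 26986, 'executed_items': 924254}
```

Polling this every few seconds (e.g. via `ceph tell mds.<name> perf dump`) shows whether pq_item_in_journal is shrinking; if it stays flat while deletions continue, the throttles are the bottleneck.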
