Hi Edouard,

I had the same problem with one of my setups. I used this article to fix it:
https://www.suse.com/support/kb/doc/?id=000019740

I changed these values, increasing or decreasing them by about 10% at a
time (an illustrative example follows the list):
        ceph config set mds mds_cache_trim_threshold xxK (should initially be increased)
        ceph config set mds mds_cache_trim_decay_rate x.x (should initially be decreased)
        ceph config set mds mds_cache_memory_limit xxxxxxxxxx (should initially be increased)
        ceph config set mds mds_recall_max_caps xxxx (should initially be increased)
        ceph config set mds mds_recall_max_decay_rate x.xx (should initially be decreased)
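
To make that concrete: using the defaults Edouard posted further down in
this thread (trim threshold 256K, trim decay rate 1.0, cache limit 4 GiB,
recall max caps 30000, recall max decay rate 1.5), a first step of roughly
10% could look like the commands below. These are illustrative values only,
not the exact ones I ended up with; adjust them to your own workload and
give the MDS some time to settle between changes.

        ceph config set mds mds_cache_trim_threshold 288K       (up from 256K)
        ceph config set mds mds_cache_trim_decay_rate 0.9       (down from 1.0)
        ceph config set mds mds_cache_memory_limit 4724464025   (up ~10% from 4294967296)
        ceph config set mds mds_recall_max_caps 33000           (up from 30000)
        ceph config set mds mds_recall_max_decay_rate 1.35      (down from 1.5)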

Dominique.

> -----Original Message-----
> From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
> Sent: Tuesday, 13 May 2025 15:46
> To: Edouard FAZENDA <e.faze...@csti.ch>
> CC: ceph-users <ceph-users@ceph.io>; Kelian SAINT-BONNET <k.saintbon...@csti.ch>
> Subject: [ceph-users] Re: 2 MDSs behind on trimming on my Ceph Cluster
> since the upgrade from 18.2.6 (reef) to 19.2.2 (squid)
> 
> Hi Edouard,
> 
> Sorry to hear that, although I'm not that surprised. I think you'll have
> to wait for the fix.
> 
> Regards,
> Frédéric.
> 
> ----- On 13 May 25, at 15:41, Edouard FAZENDA <e.faze...@csti.ch> wrote:
> 
> > Dear Frederic,
> 
> > I have applied the settings you provided; unfortunately the cluster went
> > back and forth between green and yellow afterwards, with the MDSs still
> > behind on trimming.
> 
> > Thanks for the help
> 
> > Best Regards,
> 
> > [ https://www.csti.ch/ ]
> > Swiss Cloud Provider
> 
> 
> 
> > Edouard Fazenda
> 
> > Technical Support
> 
> > [ https://www.csti.ch/ ]
> 
> > Chemin du Curé-Desclouds, 2
> > CH-1226 Thonex
> > +41 22 869 04 40
> 
> > www.csti.ch
> 
> > From: Frédéric Nass <frederic.n...@univ-lorraine.fr>
> > Sent: Friday, 9 May 2025 14:20
> > To: Edouard FAZENDA <e.faze...@csti.ch>
> > Cc: ceph-users <ceph-users@ceph.io>; Kelian SAINT-BONNET
> > <k.saintbon...@csti.ch>
> > Subject: Re: [ceph-users] 2 MDSs behind on trimming on my Ceph Cluster
> > since the upgrade from 18.2.6 (reef) to 19.2.2 (squid)
> 
> > ----- On 9 May 25, at 14:10, Frédéric Nass <frederic.n...@univ-lorraine.fr> wrote:
> 
> >> Hi Edouard,
> 
> >> ----- On 8 May 25, at 10:15, Edouard FAZENDA <e.faze...@csti.ch> wrote:
> 
> >>> Dear all,
> 
> >>> I have the following issue on my Ceph cluster: MDSs behind on trimming
> >>> since the upgrade (using cephadm) from 18.2.6 to 19.2.2.
> 
> >>> Here are some cluster logs:
> 
> >>> 8/5/25 09:00 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
> >>> 8/5/25 08:50 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
> >>> 8/5/25 08:40 AM [WRN] mds.cephfs.node2.isqjza(mds.0): Behind on trimming (326/128) max_segments: 128, num_segments: 326
> >>> 8/5/25 08:40 AM [WRN] mds.cephfs.node1.ojmpnk(mds.0): Behind on trimming (326/128) max_segments: 128, num_segments: 326
> >>> 8/5/25 08:40 AM [WRN] [WRN] MDS_TRIM: 2 MDSs behind on trimming
> >>> 8/5/25 08:40 AM [WRN] Health detail: HEALTH_WARN 2 MDSs behind on trimming
> >>> 8/5/25 08:33 AM [WRN] Health check update: 2 MDSs behind on trimming (MDS_TRIM)
> >>> 8/5/25 08:33 AM [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
> >>> 8/5/25 08:30 AM [INF] overall HEALTH_OK
> >>> 8/5/25 08:22 AM [INF] Cluster is now healthy
> >>> 8/5/25 08:22 AM [INF] Health check cleared: MDS_TRIM (was: 1 MDSs behind on trimming)
> >>> 8/5/25 08:22 AM [INF] MDS health message cleared (mds.?): Behind on trimming (525/128)
> >>> 8/5/25 08:22 AM [WRN] Health check update: 1 MDSs behind on trimming (MDS_TRIM)
> >>> 8/5/25 08:22 AM [INF] MDS health message cleared (mds.?): Behind on trimming (525/128)
> >>> 8/5/25 08:20 AM [WRN] overall HEALTH_WARN 2 MDSs behind on trimming
> >>> 8/5/25 08:10 AM [WRN] mds.cephfs.node2.isqjza(mds.0): Behind on trimming (332/128) max_segments: 128, num_segments: 332
> >>> 8/5/25 08:10 AM [WRN] mds.cephfs.node1.ojmpnk(mds.0): Behind on trimming (332/128) max_segments: 128, num_segments: 332
> >>> 8/5/25 08:10 AM [WRN] [WRN] MDS_TRIM: 2 MDSs behind on trimming
> >>> 8/5/25 08:10 AM [WRN] Health detail: HEALTH_WARN 2 MDSs behind on trimming
> >>> 8/5/25 08:03 AM [WRN] Health check update: 2 MDSs behind on trimming (MDS_TRIM)
> >>> 8/5/25 08:03 AM [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)
> >>> 8/5/25 08:00 AM [INF] overall HEALTH_OK
> 
> >>> # ceph fs status
> >>> cephfs - 50 clients
> >>> ======
> >>> RANK  STATE           MDS                  ACTIVITY     DNS   INOS  DIRS   CAPS
> >>> 0     active          cephfs.node1.ojmpnk  Reqs: 10 /s  305k  294k  91.8k  6818
> >>> 0-s   standby-replay  cephfs.node2.isqjza  Evts: 0 /s   551k  243k  90.6k  0
> >>> POOL             TYPE      USED   AVAIL
> >>> cephfs_metadata  metadata  2630M  2413G
> >>> cephfs_data      data      12.7T  3620G
> >>> STANDBY MDS
> >>> cephfs.node3.vdicdn
> >>> MDS version: ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)
> 
> >>> # ceph versions
> 
> >>> {
> >>>     "mon": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 3
> >>>     },
> >>>     "mgr": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 2
> >>>     },
> >>>     "osd": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 18
> >>>     },
> >>>     "mds": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 3
> >>>     },
> >>>     "rgw": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 6
> >>>     },
> >>>     "overall": {
> >>>         "ceph version 19.2.2 (0eceb0defba60152a8182f7bd87d164b639885b8) squid (stable)": 32
> >>>     }
> >>> }
> 
> >>> # ceph orch ps --daemon-type mds
> >>> NAME                     HOST       PORTS  STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
> >>> mds.cephfs.node1.ojmpnk  rke-sh1-1         running (18h)  4m ago     19M  1709M    -        19.2.2   4892a7ef541b  8dd8db30a1de
> >>> mds.cephfs.node2.isqjza  rke-sh1-2         running (18h)  2m ago     3y   1720M    -        19.2.2   4892a7ef541b  7b9d5b692764
> >>> mds.cephfs.node3.vdicdn  rke-sh1-3         running (18h)  108s ago   18M  27.9M    -        19.2.2   4892a7ef541b  d2de22a15e18
> 
> >>> root@node1:~# ceph config show-with-defaults mds.cephfs.rke-sh1-3.vdicdn | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
> 
> >>> mds_cache_memory_limit 4294967296 default
> 
> >>> mds_cache_trim_decay_rate 1.000000 default
> 
> >>> mds_cache_trim_threshold 262144 default
> 
> >>> mds_recall_max_caps 30000 default
> 
> >>> mds_recall_max_decay_rate 1.500000 default
> 
> >>> root@node2:~# ceph config show-with-defaults mds.cephfs.rke-sh1-2.isqjza | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
> 
> >>> mds_cache_memory_limit 4294967296 default
> 
> >>> mds_cache_trim_decay_rate 1.000000 default
> 
> >>> mds_cache_trim_threshold 262144 default
> 
> >>> mds_recall_max_caps 30000 default
> 
> >>> mds_recall_max_decay_rate 1.500000 default
> 
> >>> root@node3:~# ceph config show-with-defaults mds.cephfs.rke-sh1-1.ojmpnk | egrep "mds_cache_trim_threshold|mds_cache_trim_decay_rate|mds_cache_memory_limit|mds_recall_max_caps|mds_recall_max_decay_rate"
> 
> >>> mds_cache_memory_limit 4294967296 default
> 
> >>> mds_cache_trim_decay_rate 1.000000 default
> 
> >>> mds_cache_trim_threshold 262144 default
> 
> >>> mds_recall_max_caps 30000 default
> 
> >>> mds_recall_max_decay_rate 1.500000 default
> 
> >>> # ceph mds stat
> 
> >>> cephfs:1 {0=cephfs.node1.ojmpnk=up:active} 1 up:standby-replay 1 up:standby
> 
> >>> Do you have an idea of what could be happening? Should I increase
> >>> mds_cache_trim_decay_rate?
> 
> >>> I saw the following issue, maybe related: Bug #66948 "mon.a (mon.0) 326 :
> >>> cluster [WRN] Health check failed: 1 MDSs behind on trimming (MDS_TRIM)"
> >>> in cluster log (https://tracker.ceph.com/issues/66948), addressed by
> >>> "squid: mds: trim mdlog when segments exceed threshold and trim was idle"
> >>> by vshankar (https://github.com/ceph/ceph/pull/60838).
> >> There's a fair chance, yes: as you said, the MDS_TRIM alert appeared
> >> after the upgrade, Reef is immune to this bug (as it does not use
> >> major/minor log segments), and Squid v19.2.2 does not contain the fix.
> 
> >> You could try decreasing mds_recall_max_decay_rate to 1 (instead of 1.5).
> 
> >> Regards,
> 
> >> Frédéric.
> 
> > Apologies, the email was sent prematurely...
> 
> > What you could try while waiting for the fix is to:
> 
> > - increase mds_log_max_segments to 256 (defaults to 128)
> 
> > And possibly:
> 
> > - reduce mds_recall_max_decay_rate to 1 (defaults to 1.5)
> 
> > - reduce mds_recall_max_decay_threshold to 32K (defaults to 128K)
> 
> > - increase mds_recall_global_max_decay_threshold to 256K (defaults to
> > 128K)
> 
> > to allow the MDS to reclaim client caps more aggressively, though I'm not
> > sure this will prevent the MDS_TRIM alert from being triggered.
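
(Inline note: Frédéric's four suggestions above would map to commands like
the following; the option names and values are exactly those he lists, but
I haven't tested this particular combination myself, so treat it as a
sketch rather than a recipe:

        ceph config set mds mds_log_max_segments 256
        ceph config set mds mds_recall_max_decay_rate 1.0
        ceph config set mds mds_recall_max_decay_threshold 32K
        ceph config set mds mds_recall_global_max_decay_threshold 256K
)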
> 
> > Regards,
> 
> > Frédéric.
> 
> >>> Thanks for the help 😊
> 
> >>> Best Regards, Edouard Fazenda.
> 
> >>> [ https://www.csti.ch/ ]
> >>> Swiss Cloud Provider
> 
> 
> 
> >>> Edouard Fazenda
> 
> >>> Technical Support
> 
> >>> [ https://www.csti.ch/ ]
> 
> >>> Chemin du Curé-Desclouds, 2
> >>> CH-1226 Thonex
> >>> +41 22 869 04 40
> 
> >>> www.csti.ch
> 
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
