Thanks, everyone, for the help diagnosing this issue.

Once rebalancing finished and everything was active+clean, cluster_osdmap_trim_lower_bound increased on the OSDs. The OSDs are slowly cleaning up old maps and we are regaining free space. It may take over a week to get rid of them all, but we are no longer in danger.
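For anyone following along, a rough way to gauge trim progress is to compare the oldest map an OSD still stores against the trim lower bound. This is a hedged sketch; the field names (`oldest_map`, `cluster_osdmap_trim_lower_bound`) and the exact command that reports them vary by release, and the example values below are made up for illustration:

```shell
# Hypothetical values as they might come from e.g.
# `ceph daemon osd.0 status` (field names vary by Ceph release).
oldest_map=1200
trim_lower_bound=250000

# Epochs the OSD can still delete once trimming catches up.
pending=$(( trim_lower_bound - oldest_map ))
if [ "$pending" -lt 0 ]; then pending=0; fi
echo "osdmaps pending trim: $pending"
```

Watching that number fall over time is a reasonable proxy for the disk space you will get back.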

Best regards
Adam Prycki

On 6 May 2025 at 20:10, Anthony D'Atri wrote:


I guess we will wait 3-4 days for cluster recovery to finish and I will update 
you if something happens once it's HEALTH_OK.

Is there a way to force epoch/osdmap trimming on rebalancing cluster?

I would use 
https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py

A pass or two of that will enter upmaps to obviate the current backfill.  If 
you also temporarily disable the balancer you should very quickly get to a 
point with no backfill_wait.  Then after the trimming and the dust settles you 
should re-enable the balancer to finish.
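A hedged sketch of that sequence, assuming the CERN upmap-remapped.py script linked above is checked out locally. DRY_RUN=echo prints each command instead of running it; clear it on a real cluster, and note the script itself emits `ceph osd pg-upmap-items` commands that you pipe to sh to apply:

```shell
# Dry-run guard: echo commands instead of executing them.
DRY_RUN=echo

# Stop the balancer so it does not fight the upmaps we are about to enter.
$DRY_RUN ceph balancer off

# One pass; repeat until no backfill_wait remains. To actually apply,
# run the script and pipe its output to sh.
$DRY_RUN ./upmap-remapped.py

# After trimming finishes and the dust settles, let the balancer resume.
$DRY_RUN ceph balancer on
```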

Afaik there is no way to do it. It would be nice if we had that ability. This is not the first time we have hit this class of issue (big mon DBs). I would consider it a major flaw of Ceph: the cluster cannot get healthier until it is already in a perfectly healthy state.

Best regards
Adam Prycki
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


