Thank you guys for helping diagnose this issue. Once rebalancing finished and everything was active+clean, cluster_osdmap_trim_lower_bound increased on the OSDs. The OSDs are slowly cleaning up old maps and we are regaining free space on them. It may take over a week to get rid of them all, but we are no longer in danger.
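For anyone hitting the same problem: we tracked trimming progress by watching the oldest and newest osdmap epochs each OSD still holds. A rough sketch (field names vary by release; cluster_osdmap_trim_lower_bound only shows up in newer versions):

    # On an OSD host: oldest_map / newest_map held by that daemon
    ceph daemon osd.0 status

    # Cluster-wide view of the committed osdmap range
    ceph report | grep -E 'osdmap_first_committed|osdmap_last_committed'

As the OSDs trim, oldest_map (and osdmap_first_committed) should climb toward the current epoch and disk usage should drop.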
Best regards
Adam Prycki

On 6.05.2025 at 20:10, Anthony D'Atri wrote:
> > I guess we will wait 3-4 days for cluster recovery to finish, and I will update you if something happens once it's HEALTH_OK. Is there a way to force epoch/osdmap trimming on a rebalancing cluster?
>
> I would use https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
>
> A pass or two of that will enter upmaps to obviate the current backfill. If you also temporarily disable the balancer, you should very quickly get to a point with no backfill_wait. Then, after the trimming and the dust settles, re-enable the balancer to finish.
>
> > AFAIK there is no way to do it. It would be nice if we had that ability. This is not the first time we have hit this class of issue (big mon DBs). I would consider it a major flaw of Ceph: the cluster cannot get healthier until it is already in a perfectly healthy state.
> >
> > Best regards
> > Adam Prycki
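For the archive, the workflow Anthony describes would look roughly like this (a sketch, assuming upmap-remapped.py's usual mode of printing pg-upmap commands to stdout):

    # Stop the balancer so it doesn't fight the manual upmaps
    ceph balancer off

    # Review the upmap commands the script proposes, then pipe them to a shell
    ./upmap-remapped.py
    ./upmap-remapped.py | sh

    # Once osdmaps have trimmed and the dust settles, re-enable the balancer
    ceph balancer on

With the pending backfill mapped away, the cluster reaches active+clean and the mons can start trimming old osdmaps.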
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io