[ceph-users] CephPGImbalance warning for different OSD sizes

Eugen Block Tue, 13 May 2025 06:58:15 -0700

Hi *,

since the question came up yesterday on this list, I decided to share our workaround. I created a tracker issue [0] as well as a blog post [1] with some more details. The current Prometheus module relies on uniform OSD sizes, which is not very common, at least not from our experience or the reports on this list.

So we added a new metric to the mgr prometheus module and modified the alert expression so that it only compares OSDs of the same size (since the crush weight is derived from the OSD size, we just called the metric osd_crush_weight). This is a bit hacky and not persistent across updates etc., but it has worked great so far. It would be great if the mgr module could be improved. I'm sure there are more elegant ways to do that, but with this approach we didn't need to introduce anything new, we just utilized what was already there.
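
To illustrate the idea only (this is not the actual module patch or alert rule, those are described in [0] and [1]), here is a minimal Python sketch. The function name, the sample data and the 30% threshold are assumptions made for the example. It flags OSDs whose PG count deviates too far from the average, either across all OSDs or only within the group of OSDs that share the same crush weight.

```python
from collections import defaultdict


def imbalanced_osds(osds, threshold=0.30, group_by_weight=True):
    """Return the sorted ids of OSDs whose PG count deviates from the
    mean of their comparison group by more than `threshold` (relative).

    osds: dict mapping OSD id -> (crush_weight, num_pgs)
    group_by_weight: if True, compare only OSDs with equal crush weight;
                     if False, compare every OSD against the global mean.
    """
    groups = defaultdict(list)
    for osd_id, (crush_weight, num_pgs) in osds.items():
        if num_pgs <= 0:
            continue  # ignore OSDs that hold no PGs
        key = crush_weight if group_by_weight else "all"
        groups[key].append((osd_id, num_pgs))

    flagged = []
    for members in groups.values():
        mean = sum(n for _, n in members) / len(members)
        for osd_id, n in members:
            if abs(n - mean) / mean > threshold:
                flagged.append(osd_id)
    return sorted(flagged)


# Hypothetical cluster: two small OSDs and three large ones,
# where osd.4 holds far fewer PGs than its equally sized peers.
sample = {
    "osd.0": (1.0, 50),
    "osd.1": (1.0, 52),
    "osd.2": (4.0, 200),
    "osd.3": (4.0, 204),
    "osd.4": (4.0, 90),
}

per_weight = imbalanced_osds(sample)
global_view = imbalanced_osds(sample, group_by_weight=False)
print(per_weight)
print(global_view)
```

With this sample data, the global comparison flags the four well-balanced OSDs and misses osd.4, while the per-weight comparison flags only osd.4.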


Best regards,
Eugen

[0] https://tracker.ceph.com/issues/71310
[1] https://heiterbiswolkig.blogs.nde.ag/2025/05/13/cephadm-pg-imbalance/
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
