I'm fairly new to Ceph and am running Rook on a small cluster (half a dozen nodes, about 15 OSDs). I've noticed that space use can vary quite a bit between OSDs, by as much as 10-20%.
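To make the spread concrete, here is a minimal Python sketch that computes the gap between the fullest and emptiest OSD from the JSON output of `ceph osd df --format json`. It assumes that output contains a `nodes` list whose entries carry `name` and `utilization` (percent) fields. The helper name `utilization_spread` and the sample values are hypothetical, made up purely for illustration.

```python
import json

# Hypothetical sample shaped like `ceph osd df --format json` output.
# The field names are an assumption about that output; the values are invented.
SAMPLE = """
{"nodes": [
  {"id": 0, "name": "osd.0", "utilization": 48.0},
  {"id": 1, "name": "osd.1", "utilization": 61.5},
  {"id": 2, "name": "osd.2", "utilization": 55.0}
]}
"""


def utilization_spread(osd_df):
    """Return (emptiest OSD name, fullest OSD name, spread in percentage points)."""
    nodes = osd_df["nodes"]
    if not nodes:
        raise ValueError("no OSDs in input")
    low = min(nodes, key=lambda n: n["utilization"])
    high = max(nodes, key=lambda n: n["utilization"])
    return low["name"], high["name"], high["utilization"] - low["utilization"]


if __name__ == "__main__":
    low, high, spread = utilization_spread(json.loads(SAMPLE))
    print(f"emptiest={low} fullest={high} spread={spread:.1f} points")
```

On a real cluster the same function could be fed the parsed output of the command instead of the sample string. It only reports the spread and does nothing to correct it.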
In the documentation I see several ways of managing this, but no guidance on which is the "correct" or best one. As far as I can tell, the options are the balancer, manual manipulation of upmaps with the command-line tools, and OSD reweight. The last two can be optimized with tools that calculate appropriate corrections. There is also the new read/active upmap (at least for non-EC pools), which is triggered manually.

The balancer alone leaves fairly wide deviations in space use, and these can become more significant during recovery. During recovery I've seen OSDs hit the 80% threshold and start impacting IO when the cluster as a whole was only 50-60% full.

I've started using ceph osd reweight-by-utilization, which seems much more effective at balancing things, but it seems redundant with the balancer, which I have turned on.

What is generally considered the best practice for OSD balancing?

--
Rich
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io