Hi Josh,

That's another interesting dimension... Indeed, a cluster with plenty of free capacity could be balanced by workload/IOPS, but once it reaches maybe 60 or 70% full, I think capacity would need to take priority.
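To make that concrete, here's a toy weighting sketch in Python (made-up thresholds and score inputs, not anything the mgr balancer or Jonas's tool actually implements):

    # Hypothetical illustration only: blend a capacity score and a workload
    # score, shifting the weight towards capacity once the cluster passes
    # roughly 60-70% full.
    def placement_score(capacity_imbalance: float,
                        workload_imbalance: float,
                        cluster_fullness: float) -> float:
        """Lower is better; all inputs normalised to 0..1."""
        if cluster_fullness <= 0.60:
            capacity_weight = 0.0          # plenty of space: balance IOPS
        elif cluster_fullness >= 0.70:
            capacity_weight = 1.0          # getting full: balance capacity
        else:
            # linear ramp between the two regimes
            capacity_weight = (cluster_fullness - 0.60) / 0.10
        return (capacity_weight * capacity_imbalance
                + (1.0 - capacity_weight) * workload_imbalance)

    # A half-full cluster optimises for workload, a nearly full one for capacity:
    print(placement_score(0.3, 0.8, 0.50))   # -> 0.8
    print(placement_score(0.3, 0.8, 0.90))   # -> 0.3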
But to be honest I don't really understand the workload/IOPS balancing use case. Can you describe some of the scenarios you have in mind?

.. Dan

On Wed, 20 Oct 2021, 20:45 Josh Salomon, <jsalo...@redhat.com> wrote:

> Just another point of view:
> The current balancer balances the capacity, but this is not enough. The
> balancer should also balance the workload, and we plan on adding primary
> balancing for Quincy. In order to balance the workload you should work pool
> by pool, because pools have different workloads. So while the observation
> about the +1 PGs is correct, I believe the correct solution should be
> taking this into consideration while still balancing capacity pool by pool.
> Capacity balancing is a functional requirement, while workload balancing
> is a performance requirement, so it is important only for very loaded
> systems (loaded in terms of high IOPS, not nearly full systems).
>
> I would appreciate comments on this thought.
>
> On Wed, 20 Oct 2021, 20:57 Dan van der Ster, <d...@vanderster.com> wrote:
>
>> Hi Jonas,
>>
>> From your readme:
>>
>> "the best possible solution is some OSDs having an offset of 1 PG to the
>> ideal count. As a PG-distribution-optimization is done per pool, without
>> checking other pools' distribution at all, some devices will be the +1 more
>> often than others. At worst one OSD is the +1 for each pool in the cluster."
>>
>> That's an interesting observation/flaw which hadn't occurred to me
>> before. I think we don't ever see it in practice in our clusters because
>> we do not have multiple large pools on the same OSDs.
>>
>> How large are the variances in your real clusters? I hope the example in
>> your readme isn't from real life??
>>
>> Cheers, Dan
>>
>> On Wed, 20 Oct 2021, 15:11 Jonas Jelten, <jel...@in.tum.de> wrote:
>>
>>> Hi!
>>>
>>> I've been working on this for quite some time now and I think it's ready
>>> for some broader testing and feedback.
>>>
>>> https://github.com/TheJJ/ceph-balancer
>>>
>>> It's an alternative standalone balancer implementation, optimizing for
>>> equal OSD storage utilization and PG placement across all pools.
>>>
>>> It doesn't change your cluster in any way; it just prints the commands
>>> you can run to apply the PG movements.
>>> Please play around with it :)
>>>
>>> Quickstart example: generate 10 PG movements on hdd to stdout
>>>
>>>     ./placementoptimizer.py -v balance --max-pg-moves 10 --only-crushclass hdd | tee /tmp/balance-upmaps
>>>
>>> When there are remapped PGs (e.g. after applying the above upmaps), you
>>> can inspect progress with:
>>>
>>>     ./placementoptimizer.py showremapped
>>>     ./placementoptimizer.py showremapped --by-osd
>>>
>>> And you can get a nice pool and OSD usage overview:
>>>
>>>     ./placementoptimizer.py show --osds --per-pool-count --sort-utilization
>>>
>>> Of course there are many more features and optimizations to be added,
>>> but it has already served us very well in reclaiming terabytes of
>>> previously unavailable storage where the `mgr balancer` could no longer
>>> optimize.
>>>
>>> What do you think?
>>>
>>> Cheers
>>> -- Jonas
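As an aside, here is a tiny toy example of the "+1 PG per pool" effect quoted from the readme above (hypothetical pool sizes and OSD count, not numbers from any real cluster):

    # Toy illustration of the "+1 PG" stacking effect: each pool is spread as
    # evenly as integer math allows, but if every pool independently hands its
    # remainder to the same OSDs, one OSD collects the "+1" for every pool.
    osds = ["osd.0", "osd.1", "osd.2"]
    pools = {"rbd": 32, "cephfs_data": 64, "rgw": 16}   # PGs per pool (made up)

    pg_per_osd = {osd: 0 for osd in osds}
    for pool, pg_num in pools.items():
        base, remainder = divmod(pg_num, len(osds))
        for i, osd in enumerate(osds):
            # per-pool balancing: the first `remainder` OSDs get one extra PG
            pg_per_osd[osd] += base + (1 if i < remainder else 0)

    print(pg_per_osd)
    # {'osd.0': 39, 'osd.1': 37, 'osd.2': 36}
    # osd.0 picked up the extra PG in every single pool.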